Skip to main content

Home/ Google AppEngine/ Group items tagged bulk

Rss Feed Group items tagged

Esfand S

Using the App Engine Mapper for bulk data import « Ikai Lan says - 0 views

  • The most obvious use case is data import. A developer looking to import large amounts of data would take the following steps: Create a CSV file containing the data you want to import. The assumption here is that each line of data corresponds to a datastore entity you want to create Upload the CSV file to the blobstore. You’ll need billing to be enabled for this to work. Create your Mapper, push it live and run your job importing your data. This isn’t meant to be a replacement for the bulk uploader tool; merely an alternative. This method requires a good amount more programmatic changes for custom data transforms. The advantage of this method is that the work is done on the server side, whereas the bulk uploader makes use of the remote API to get work done. Let’s get started on each of the steps.
  • to build Mappers that map across some large, contiguous piece of data as opposed to Entities in the datastore
Esfand S

Advanced Bulk Loading Part 5: Bulk Loading for Java - Nick's Blog - 0 views

  •  
    This article is outdated. There is a native java remote-api servelet now.
Esfand S

Arachnid's bulkupdate at master - GitHub - 0 views

  • The App Engine bulk update library is a library for the App Engine Python runtime, designed to facilitate doing bulk processing of datastore records. It provides a framework for easily performing operations on all the entities matched by any arbitrary datastore query, as well as providing an administrative interface to monitor and control batch update jobs.
Esfand S

Effectively Parallelizing Fetches (with pictures, yay!) - Google App Engine | Google Gr... - 0 views

  • As I understand it, the process of performing a single fetch (call to get())  from the dastastore using a key basically involves finding the host housing the entity, opening a socket, fetching the data, and then cleaning up the connection.  So to fetch something like 30 entities from the datastore, you're repeating the process 30 times over in serial, each time incurring whatever overhead is involved.  I also read that if you perform bulk fetches, (ie passing multiple keys at once) you can eliminate a great deal of that overhead.  In one of the videos I watched from Google I/0 2009, the presenter (whose name I forget - d'oh) said that performing a bulk fetch actually performs the fetches in parallel from the data store and you shoudl see requests noticeably faster.
Esfand S

Deleting entities in bulk. - Google App Engine | Google Groups - 0 views

  • class DeleteFull():     def execute(self):         deleting = model_class_name.all().order('__key__').fetch(100)         while deleting:             a = []             key = deleting[-1].key()             for item in deleting:                 a.append(item)             db.delete(a)             deleting = model_class_name.all().filter('__key__ >', key).order('__key__').fetch(100) This purged everything, but it took a hell of a long time.
Esfand S

How to upload primary key as an id instead of name - Google App Engine for Java | Googl... - 0 views

  • If you use the RemoteDatastore you have complete control in Java of   what you key is. http://code.google.com/p/remote-datastore/ e.g. this code puts an entity with the id set in your remote datastore   from your local machine:      RemoteDatastore.install();      RemoteDatastore.divert("http://myVersion.latest.myApp.appspot.com/remote-datastore ", "myApp", "myVersion");      DatastoreService service =   DatastoreServiceFactory.getDatastoreService();      Key key = KeyFactory.createKey("MyKindName, 35);      Entity entity1 = new Entity(key);      entity1.setProperty("property1", "hello");      datastore.put(Arrays.asList(entity1, entity2);
  • You can do this now using the RemoteDatastore Java utility http://code.google.com/p/remote-datastore/ For example, this code runs on your desktop and creates a single   entity in your live datastore:     // divert datastore operations to live application     RemoteDatastore.install();     RemoteDatastore.divert("http://myVersion.latest.myApp.appspot.com/remote-datastore ", "myApp", "myVersion");     // create an entity with a numeric key     Key key = KeyFactory.createKey("MyKindName, 35);     Entity entity1 = new Entity(key);     entity1.setProperty("property1", "hello");     // put entity to the remote datastore     DatastoreService service =   DatastoreServiceFactory.getDatastoreService();     datastore.put(entity1); This also works for bulk puts
  • this won't be available until 1.3.6. You should be able to do something like this:  - property: __key__    external_name: CityId    export_transform: datastore.Key.id    import_transform: lambda value: datastore.Key.from_path('City', int(value))
Esfand S

How to upload primary key as an id instead of name - Google App Engine for Java | Googl... - 0 views

  • You can do this now using the RemoteDatastore Java utility http://code.google.com/p/remote-datastore/ For example, this code runs on your desktop and creates a single   entity in your live datastore:     // divert datastore operations to live application     RemoteDatastore.install();     RemoteDatastore.divert("http://myVersion.latest.myApp.appspot.com/remote-datastore ", "myApp", "myVersion");     // create an entity with a numeric key     Key key = KeyFactory.createKey("MyKindName, 35);     Entity entity1 = new Entity(key);     entity1.setProperty("property1", "hello");     // put entity to the remote datastore     DatastoreService service =   DatastoreServiceFactory.getDatastoreService();     datastore.put(entity1); This also works for bulk puts
Esfand S

Using the bulkloader with Java App Engine « Ikai Lan says - 0 views

  • I’m trying to use the bulkuploader for a java program but am running into an interesting issue. My PrimaryKey property is a Long, and in java I can explicitly give them id numbers and they show in the data store as “id=xxx”. When I download the data via the appcfg.py I get a reasonably looking data file. If I reupload the same file it actually inserts things into the data store with key “name=xxx” and therefore doubles every one of my entries.
  • create a custom uploader using the file upload example provided on appengine’s java FAQ.
  • App Engine’s datastore is schemaless. That is – it is possible to have Entities of the same Kind with completely different sets of properties. Most of the time, this is a good thing. MySQL, for instance, requires a table lock to do a schema update. By being schema free, migrations can happen lazily, and application developers can check at runtime for whether a Property exists on a given Entity, then create or set the value as needed. But there are times when this isn’t sufficient. One use case is if we want to change a default value on Entities and grandfather older Entities to the new default value, but we also want the default value to possibly be null.
  • ...1 more annotation...
  • I used a combination of uploading entire chunks of my data via FileUpload (see link below), and explicitly creating my Java objects with the keys that I wanted (which were easily implicitly defined by the data format as the first one would be ‘n’ and every object after it was n++). I would then insert the set of objects in bulk. The problem I hit the most was finding the right number of objects per store call. There are specific limits that make this process long and annoying. I ran something locally that would continue trying to upload the chunk of data until it got a good response from the server page. It took me something on the order of 6-8 hours to upload about 1.5M tiny objects. http://code.google.com/appengine/kb/java.html#fileforms
Esfand S

KeyRange - 0 views

  • public final class KeyRangeextends java.lang.Objectimplements java.lang.Iterable<Key>, java.io.Serializable Represents a range of unique datastore identifiers from getStart().getId() to getEnd().getId() inclusive. The Keys returned by an instance of this class have been consumed in the datastore's id-space and are guaranteed never to be reused. This class can be used to construct Entities with Keys that have specific id values without fear of the datastore creating new records with those same ids at a later date. This can be helpful as part of a data migration or large bulk upload where you may need to preserve existing ids and relationships between entities. This class is threadsafe but the Iterators returned by iterator() are not.
Esfand S

Looking for Bulk Loader beta testers - Google App Engine | Google Groups - 0 views

  • for Java developers, there is a remote API handler available; adding following to your web xml file should work (disclaimer: I have not yet personally tested this.) <servlet>   <servlet-name>remoteapi</servlet-name>   <servlet-class>com.google.apphosting.utils.remoteapi.RemoteApiServlet</servlet-class> </servlet> <servlet-mapping>   <servlet-name>remoteapi</servlet-name>   <url-pattern>/remote_api</url-pattern> </servlet-mapping> <security-constraint>   <web-resource-collection>     <web-resource-name>remoteapi</web-resource-name>     <url-pattern>/remote_api</url-pattern>   </web-resource-collection>   <auth-constraint>     <role-name>admin</role-name>   </auth-constraint> </security-constraint>
1 - 20 of 28 Next ›
Showing 20 items per page