So why did Aadhar engage with MongoDB in the first place, and will it continue working with the startup?
Sudhir Narayana, assistant director general at Aadhar's technology center, told me that MongoDB was among several database products, apart from MySQL, Hadoop and HBase, originally procured for running the database search. Unlike MySQL, which could only store demographic data, MongoDB was able to store pictures.
However, Aadhar has been slowly shifting most of its database-related work to MySQL, after realizing that MongoDB was unable to cope with massive chunks of data running to millions of packets.
Web/app projects these days often have many distributed parts. It's not uncommon for groups to use the right tool for each job. A typical selection looks something like the list below.
Redis for queuing and caching.
Elasticsearch for searching, and Logstash for logs.
InfluxDB or RRD for time series.
S3 for an object store.
PostgreSQL for relational data with constraints, and validation via schemas.
Celery for job queues.
Kafka for a buffer of queues or stream processing.
Exception logging with PostgreSQL (perhaps using Sentry)
KDB for low latency analytics on your column oriented data.
MongoDB/ZODB for storing JSON documents (or mangodb as a /dev/null replacement).
SQLite for embedded.
Neo4j for graph databases.
RethinkDB for your realtime data, when data changes, other parts 'react'.
...
Across all the different nodes, this could easily cost thousands a month, require a lot of ops knowledge and support, and use up a lot of electricity. Setting all of this up from scratch could cost one to four weeks of developer time, depending on whether the developers already know the various stacks. Perhaps you'd have ten nodes to support.
Could you gain an ops advantage by using only PostgreSQL?
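As one illustration of what "only PostgreSQL" might look like, here is a minimal sketch of a job queue built on a plain table, using PostgreSQL's FOR UPDATE SKIP LOCKED clause (available since 9.5) so that concurrent workers do not claim the same row. The jobs table, its columns, and the claim_next_job helper are hypothetical names chosen for this example, not something taken from a particular project.

```python
from typing import Any, Optional, Tuple

# Hypothetical table, assumed for this sketch:
#   jobs(id serial primary key, status text, payload jsonb)
CLAIM_JOB_SQL = """
UPDATE jobs
   SET status = 'running'
 WHERE id = (
        SELECT id
          FROM jobs
         WHERE status = 'queued'
         ORDER BY id
         LIMIT 1
           FOR UPDATE SKIP LOCKED
       )
RETURNING id, payload
"""


def claim_next_job(cursor: Any) -> Optional[Tuple[Any, ...]]:
    """Claim one queued job, or return None when the queue is empty.

    `cursor` is any DB-API 2.0 cursor connected to PostgreSQL 9.5+
    (for example one from psycopg2). The caller is responsible for
    committing the transaction once the job has been handled.
    """
    cursor.execute(CLAIM_JOB_SQL)
    return cursor.fetchone()
```

A worker would call claim_next_job in a loop and commit after each job; whether this is enough to replace a dedicated queue depends on the workload.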
PostgreSQL is an open source, multi-purpose relational database system that is widely used throughout the world. It is one large system made up of integrated subsystems, each of which implements a particular complex feature and cooperates with the others. Although understanding its internal mechanisms is crucial for both administration and integration, the system's sheer size and complexity make that understanding hard to acquire.
In the modern era, software is commonly delivered as a service: called web apps, or software-as-a-service. The twelve-factor app is a methodology for building software-as-a-service apps that:
- Use declarative formats for setup automation, to minimize time and cost for new developers joining the project;
- Have a clean contract with the underlying operating system, offering maximum portability between execution environments;
- Are suitable for deployment on modern cloud platforms, obviating the need for servers and systems administration;
- Minimize divergence between development and production, enabling continuous deployment for maximum agility;
- And can scale up without significant changes to tooling, architecture, or development practices.
The twelve-factor methodology can be applied to apps written in any programming language, and which use any combination of backing services (database, queue, memory cache, etc).
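One way an app might stay indifferent to which backing services it is attached to is to locate each one through a URL read from the environment. The sketch below assumes that convention; the backing_service helper, the DATABASE_URL variable name, and the default URL are illustrative choices, not part of the text above.

```python
import os
from urllib.parse import urlparse


def backing_service(env_var: str, default: str) -> dict:
    """Describe a backing service from a URL stored in the environment.

    Falls back to `default` (for example a local development database)
    when the variable is not set.
    """
    url = urlparse(os.environ.get(env_var, default))
    return {
        "scheme": url.scheme,          # e.g. "postgresql" or "redis"
        "host": url.hostname,
        "port": url.port,
        "name": url.path.lstrip("/"),  # database name, if any
    }


# Hypothetical variable name and default URL, for illustration only.
db = backing_service("DATABASE_URL", "postgresql://localhost:5432/app_dev")
```

Swapping a local database for a managed one then only means changing the environment variable, with no change to the code.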