Sunday 1 July 2012

MongoDB and Solr


  1. MongoDB is a document-oriented store, like Solr, but without a fixed schema. Fields can be added dynamically on the fly, much like columns in HBase. A single document cannot exceed the maximum document size, though (4 MB in older releases, 16 MB since MongoDB 1.8). If we need to store anything larger than that, we have to go for GridFS.
  2. MongoDB data is not stored in HDFS (a distributed file system); it is stored on local disk. For scalability MongoDB has shards (like Solr). It also has a feature called replica sets: most people use the replicas (secondaries) for querying and the primary for inserting, updating, and deleting.
  3. Retrieving documents from MongoDB is relatively faster than from Solr, because MongoDB has no IDF, analyzers, scoring, etc. The data is first held in memory-mapped files and is later flushed to persistent storage (disk).
  4. MongoDB supports map-reduce. However, the people at the meetup who use MongoDB said that "processing data using map-reduce is painful" and mentioned that map-reduce is not yet mature in MongoDB. There are a lot of limitations on map-reduce in MongoDB.
  5. In MongoDB, duplication can be avoided by using a DBRef (similar to a foreign key). Even so, when a record is deleted from the parent collection, deleting the matching records in the child collection has to be dealt with in the middle tier.
  6. MongoDB is durable and reliable.
  7. They have geospatial support, which is very similar to Solr's. I expected that they would have a facility for storing multiple lat-lon pairs (more than 6), but the people there were not sure about that. They said they would have an answer at the next meetup.
  8. The meetup people said that for analytics they use PostgreSQL/MySQL (not MongoDB). To make PostgreSQL/MySQL faster, they removed the security checks and a lot of plugins, so that the database acts like a name-value pair store (with a defined schema), and then deployed it. This is what Facebook and others are doing. (We can try this to see if it gives any performance gain or solves any of our problems.)
  9. MongoDB can be a replacement for an RDBMS (but transactions are not supported).
  10. For big data, MongoDB won't be the better choice, because there is a high chance of crashes as the data grows (due to its reliance on memory-mapped files).
  11. MongoDB is really good, but it depends on the use case. The people who came to the meetup were only dealing with data in the gigabyte range, so it works well for them.
    Indexing data in MongoDB is faster than in Solr, because there is no IDF or scoring.
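
To make point 5 a bit more concrete, here is a minimal sketch of what "handling the child deletion in the middle tier" could look like. It is only an illustration: the `authors` and `posts` collections, their fields, and the `delete_author_cascade` helper are all made up for this example, and plain Python dicts stand in for MongoDB collections. Only the `{"$ref": ..., "$id": ...}` shape follows the DBRef convention. A real application would issue the equivalent deletes through its driver.

```python
# Hypothetical in-memory stand-ins for two MongoDB collections.
# The collection names and fields are assumptions made for this sketch.
authors = {
    1: {"_id": 1, "name": "alice"},
    2: {"_id": 2, "name": "bob"},
}
posts = {
    10: {"_id": 10, "title": "first", "author": {"$ref": "authors", "$id": 1}},
    11: {"_id": 11, "title": "second", "author": {"$ref": "authors", "$id": 1}},
    12: {"_id": 12, "title": "third", "author": {"$ref": "authors", "$id": 2}},
}


def delete_author_cascade(author_id):
    """Delete an author and every post whose DBRef points at that author.

    The database does not do this for us, so the application code has to.
    Returns the number of child posts that were removed.
    """
    if author_id not in authors:
        return 0
    # Collect the child ids first so the dict is not mutated while iterating.
    orphaned = [
        post_id
        for post_id, post in posts.items()
        if post["author"]["$ref"] == "authors"
        and post["author"]["$id"] == author_id
    ]
    for post_id in orphaned:
        del posts[post_id]
    del authors[author_id]
    return len(orphaned)
```

Calling `delete_author_cascade(1)` removes the author together with both posts that reference it and leaves the post that belongs to author 2 untouched. Since there are no transactions (point 9), the two steps are not atomic, so the sketch removes the children first and the parent last.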

Please have a look at this site:

http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

Best place for HBase: when you use the Hadoop/HDFS stack, and when you need random, real-time read/write access to BigTable-like data.

For example: For data that's similar to a search engine's data



Let me know if you have any questions or if you spot any mistakes.