Informal

Postgres to MongoDB Migrator tool - Pg2Mongo

Postgres to MongoDB Migrator tool - Pg2Mongo

Cross-database migrations have always been tedious and traumatic. We will just end up writing hours and hours of scripts to perform a task and it is useful for a single time only, which makes us think: “All this horsepower and no room to gallop?”

Fig1: Stuck

Postgres to MongoDB

There could be multiple situations when there could be a need to migrate data from a relational to a non-relational datasource.

  • There could be situations where there is a need for a database change
  • Legacy system change to microservices
  • Change made to reveal a high performance.

Performing the switch from relational to non-relational databases can be tedious. And Pg2Mongo is the solution to all the migration problems to migrate data from Postgres to MongoDB.

Pg2Mongo

Fig2: Pg2Mongo

Pg2Mongo is a opensource migration tool, written on python3 which gives exclusive control over cross-database migrations. The migration from relational to non-relational MongoDB can be done very easily by the following steps.

  • Make sure that you have access to your very own Postgres and MongoDB servers.
  • Clone the repository and install the requirements.

    git clone https://github.com/datawrangl3r/pg2mongo
    sudo pip3 install -r pg2mongo/requirements.txt
    
  • Modify pg2mongo.yml based on the server names and databases that you access

  • For demonstration purposes, we will be using the sample migration script in the repository sample_migration.sql.

    psql -f sample_migration.sql
    
  • Now that our sample data has been loaded, the script needs to understand how the transformation should occur. This can be mentioned in the pg2mongo.yml as

    EXTRACTION:
    HOST : localhost 
    USER : postgres
    PASSWORD : 
    DATABASE : fandom
    COMMIT:
    HOST : localhost
    USER :
    PASSWORD :
    DATABASE : fandom
    MIGRATION:
    INIT_TABLE: universes
    INIT_KEYS:
        - id as uv_id
        - universe as trax_universe
        - created_at as trax_created_at
    SKELETON : 
        - KEY1 = {}
        - KEY2 = {}
    TABLES_ORDER :
        - heroes
        - weapons
    TABLES :
        heroes :
            condition : universe_id = uv_id
                #Dictionary's key ==> Postgres Field of the corresponding schema.table1
            mapping   :
                - KEY1['hero_id'] = %s['hero_id']
                - KEY1['universe_id'] = %s['universe_id']
                - KEY1['name'] = %s['name']
                - KEY1['created_at'] = %s['created_at']
                - KEY1['weapons'] = []
        weapons :
            condition : hero_id = KEY1['hero_id']
            mapping   :
                - list:
                    - KEY1['weapons'].append({})
                    - KEY1['weapons'][-1]['weapon_name'] = %s['weapon_name']
                    - KEY1['weapons'][-1]['weapon_category'] = %s['weapon_category']
    COLLECTIONS :
        heroes : KEY1
        #Collection name <== skeleton
    

It might sound overwhelming but each component is explained in detail, as follows:

Component What it does
INIT_TABLE Initial table from which data needs to be migrated. This could be a prime table such as a transactions table with a primary key having multiple foreign constraints to other tables of the PostgreSQL database. FOR EACH ENTRY IN THIS TABLE, THE LINKING OF OTHER TABLES WILL HAPPEN WHILE DEFINING THE TABLES.
INIT_KEYS KEYS of the init_table (aliases can be given using ‘AS’)
SKELETON Skeleton is an empty raw python dictionary assignment which will transform to a MongoDB document, upon migration
TABLES_ORDER The order by which the TABLES section needs to be executed for each of the entry from INIT_TABLE
TABLES Set of PostgreSQL tables enlisted along with condition and corresponding mapping. In the case of lists inside a dictionary, a list can be mentioned. Mapping is where, the association of skeleton to the table keys is defined. The value assignments are python compatible; hence, they are defined by using ‘%s’ and other python based variable transformation functions can be used over here
COLLECTIONS This is where the push of the skeleton to the corresponding MongoDB collection takes place.
  • Now that the migration template is ready, let’s invoke the migration by running:

    $ python3 __init__.py
    

You may find that based on the specified collection and transformation details mentioned in the configuration file, the migration would have completed in the destination MongoDB.

Fig3: Relational to Non-Relational

comments powered by Disqus