Which database to store very large nested Python dicts?











up vote
3
down vote

favorite
1












My script produces data in the following format:



dictionary = {
(.. 42 values: None, 1 or 2 ..): {
0: 0.4356, # ints as keys, floats as values
1: 0.2355,
2: 0.4352,
...
6: 0.6794
},
...
}


where:





  • (.. 42 values: None, 1 or 2 ..) is a game state

  • inner dict stores calculated values of actions which are possible in that state


The problem is that the state space is very big (millions of states), so the whole data stucture cannot be stored in memory. That's why I'm looking for a database engine which would fit my needs and I could use with Python. I need to get the list of actions and their values in the given state (previously mentioned tuple of 42 values) and to modify value of given action in given state.










share|improve this question
























  • It looks like your dictionary is one level deep. Any key-value store or an SOL database would fit your problem.
    – 9000
    Dec 28 '15 at 1:01

















up vote
3
down vote

favorite
1












My script produces data in the following format:



dictionary = {
(.. 42 values: None, 1 or 2 ..): {
0: 0.4356, # ints as keys, floats as values
1: 0.2355,
2: 0.4352,
...
6: 0.6794
},
...
}


where:





  • (.. 42 values: None, 1 or 2 ..) is a game state

  • inner dict stores calculated values of actions which are possible in that state


The problem is that the state space is very big (millions of states), so the whole data stucture cannot be stored in memory. That's why I'm looking for a database engine which would fit my needs and I could use with Python. I need to get the list of actions and their values in the given state (previously mentioned tuple of 42 values) and to modify value of given action in given state.










share|improve this question
























  • It looks like your dictionary is one level deep. Any key-value store or an SOL database would fit your problem.
    – 9000
    Dec 28 '15 at 1:01















up vote
3
down vote

favorite
1









up vote
3
down vote

favorite
1






1





My script produces data in the following format:



dictionary = {
(.. 42 values: None, 1 or 2 ..): {
0: 0.4356, # ints as keys, floats as values
1: 0.2355,
2: 0.4352,
...
6: 0.6794
},
...
}


where:





  • (.. 42 values: None, 1 or 2 ..) is a game state

  • inner dict stores calculated values of actions which are possible in that state


The problem is that the state space is very big (millions of states), so the whole data stucture cannot be stored in memory. That's why I'm looking for a database engine which would fit my needs and I could use with Python. I need to get the list of actions and their values in the given state (previously mentioned tuple of 42 values) and to modify value of given action in given state.










share|improve this question















My script produces data in the following format:



dictionary = {
(.. 42 values: None, 1 or 2 ..): {
0: 0.4356, # ints as keys, floats as values
1: 0.2355,
2: 0.4352,
...
6: 0.6794
},
...
}


where:





  • (.. 42 values: None, 1 or 2 ..) is a game state

  • inner dict stores calculated values of actions which are possible in that state


The problem is that the state space is very big (millions of states), so the whole data stucture cannot be stored in memory. That's why I'm looking for a database engine which would fit my needs and I could use with Python. I need to get the list of actions and their values in the given state (previously mentioned tuple of 42 values) and to modify value of given action in given state.







python database python-2.7 dictionary






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Dec 27 '15 at 19:27









timgeb

47.3k116287




47.3k116287










asked Dec 27 '15 at 18:37









Luke

729628




729628












  • It looks like your dictionary is one level deep. Any key-value store or an SOL database would fit your problem.
    – 9000
    Dec 28 '15 at 1:01




















  • It looks like your dictionary is one level deep. Any key-value store or an SOL database would fit your problem.
    – 9000
    Dec 28 '15 at 1:01


















It looks like your dictionary is one level deep. Any key-value store or an SOL database would fit your problem.
– 9000
Dec 28 '15 at 1:01






It looks like your dictionary is one level deep. Any key-value store or an SOL database would fit your problem.
– 9000
Dec 28 '15 at 1:01














4 Answers
4






active

oldest

votes

















up vote
1
down vote













Check out ZODB: http://www.zodb.org/en/latest/



It's natve object DB for Python that supports transactions, caching, pluggable layers, pack operations (for keeping history) and BLOBs.






share|improve this answer




























    up vote
    1
    down vote













    You can use a key-value cache solution. A good one is Redis. It`s very fast and simple, written on the C and more over than just a key value cache. Integration with python just several lines of code. The redis is also can be scaled very easy for the really big data. I worked in the game industry and understand what I am talking about.



    Also, as already mentioned here, you can use more complex solution, not a cache, the database PostgresSQL. Now it supports a JSON binary format field - JSONB. I think the best python database ORM is the SQLAlchemy. It supports PostgresSQL out of the box. I will use this one in my code block. For example, you have a table



    class MobTable(db.Model):
    tablename = 'mobs'

    id = db.Column(db.Integer, primary_key=True)
    stats = db.Column(JSONB, index=True, default={})


    If your have a mob with such json stats



    {
    id: 1,
    title: 'UglyOrk',
    resists: {cold: 13}
    }


    You can search all mobs with the not null cold resists



    expr = MobTable.stats[("resists", "cold")]
    q = (session.query(MobTable.id, expr.label("cold_protected"))
    .filter(expr != None)
    .all())





    share|improve this answer























    • I want to store a python dict in redis but as i am getting data every 15 minutes so i also need to update the redis every 15 minutes. Can you help me how I can achieve that? I mean, How i can dynamically assign a key every time i write data in redis as when i am using the same key.. it is overwriting the old data. If you can answer my question here on stackoverflow :- stackoverflow.com/questions/52712982/…
      – ak333
      Oct 9 at 13:02


















    up vote
    1
    down vote













    I recommend you use HD5f. It's a data base format that works perfectly with Python (it is specifically developed for Python) and stores the data in binary format. This reduces the size of the data to be stored a great extent! More importantly it gives you the ability of random access which I believe serves for your purposes. Also, if you do not use any compression method you will retrieve the data with the highest possible speed.






    share|improve this answer























    • Double indexing did not work for me, e.g. group[state][1]. I can't also assign a dict, e.g. group[state] = { 0: 0.43, .., 6: 0.65 }.
      – Luke
      Dec 28 '15 at 8:08


















    up vote
    0
    down vote













    You can also store it as JSONB in PostgreSQL DB.



    For connecting with PostgreSQL you can use psycopg2, which is compliant with Python Database API Specification v2.0.






    share|improve this answer





















      Your Answer






      StackExchange.ifUsing("editor", function () {
      StackExchange.using("externalEditor", function () {
      StackExchange.using("snippets", function () {
      StackExchange.snippets.init();
      });
      });
      }, "code-snippets");

      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "1"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f34483567%2fwhich-database-to-store-very-large-nested-python-dicts%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      4 Answers
      4






      active

      oldest

      votes








      4 Answers
      4






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes








      up vote
      1
      down vote













      Check out ZODB: http://www.zodb.org/en/latest/



      It's natve object DB for Python that supports transactions, caching, pluggable layers, pack operations (for keeping history) and BLOBs.






      share|improve this answer

























        up vote
        1
        down vote













        Check out ZODB: http://www.zodb.org/en/latest/



        It's natve object DB for Python that supports transactions, caching, pluggable layers, pack operations (for keeping history) and BLOBs.






        share|improve this answer























          up vote
          1
          down vote










          up vote
          1
          down vote









          Check out ZODB: http://www.zodb.org/en/latest/



          It's natve object DB for Python that supports transactions, caching, pluggable layers, pack operations (for keeping history) and BLOBs.






          share|improve this answer












          Check out ZODB: http://www.zodb.org/en/latest/



          It's natve object DB for Python that supports transactions, caching, pluggable layers, pack operations (for keeping history) and BLOBs.







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Dec 27 '15 at 23:04









          siefca

          7751013




          7751013
























              up vote
              1
              down vote













              You can use a key-value cache solution. A good one is Redis. It`s very fast and simple, written on the C and more over than just a key value cache. Integration with python just several lines of code. The redis is also can be scaled very easy for the really big data. I worked in the game industry and understand what I am talking about.



              Also, as already mentioned here, you can use more complex solution, not a cache, the database PostgresSQL. Now it supports a JSON binary format field - JSONB. I think the best python database ORM is the SQLAlchemy. It supports PostgresSQL out of the box. I will use this one in my code block. For example, you have a table



              class MobTable(db.Model):
              tablename = 'mobs'

              id = db.Column(db.Integer, primary_key=True)
              stats = db.Column(JSONB, index=True, default={})


              If your have a mob with such json stats



              {
              id: 1,
              title: 'UglyOrk',
              resists: {cold: 13}
              }


              You can search all mobs with the not null cold resists



              expr = MobTable.stats[("resists", "cold")]
              q = (session.query(MobTable.id, expr.label("cold_protected"))
              .filter(expr != None)
              .all())





              share|improve this answer























              • I want to store a python dict in redis but as i am getting data every 15 minutes so i also need to update the redis every 15 minutes. Can you help me how I can achieve that? I mean, How i can dynamically assign a key every time i write data in redis as when i am using the same key.. it is overwriting the old data. If you can answer my question here on stackoverflow :- stackoverflow.com/questions/52712982/…
                – ak333
                Oct 9 at 13:02















              up vote
              1
              down vote













              You can use a key-value cache solution. A good one is Redis. It`s very fast and simple, written on the C and more over than just a key value cache. Integration with python just several lines of code. The redis is also can be scaled very easy for the really big data. I worked in the game industry and understand what I am talking about.



              Also, as already mentioned here, you can use more complex solution, not a cache, the database PostgresSQL. Now it supports a JSON binary format field - JSONB. I think the best python database ORM is the SQLAlchemy. It supports PostgresSQL out of the box. I will use this one in my code block. For example, you have a table



              class MobTable(db.Model):
              tablename = 'mobs'

              id = db.Column(db.Integer, primary_key=True)
              stats = db.Column(JSONB, index=True, default={})


              If your have a mob with such json stats



              {
              id: 1,
              title: 'UglyOrk',
              resists: {cold: 13}
              }


              You can search all mobs with the not null cold resists



              expr = MobTable.stats[("resists", "cold")]
              q = (session.query(MobTable.id, expr.label("cold_protected"))
              .filter(expr != None)
              .all())





              share|improve this answer























              • I want to store a python dict in redis but as i am getting data every 15 minutes so i also need to update the redis every 15 minutes. Can you help me how I can achieve that? I mean, How i can dynamically assign a key every time i write data in redis as when i am using the same key.. it is overwriting the old data. If you can answer my question here on stackoverflow :- stackoverflow.com/questions/52712982/…
                – ak333
                Oct 9 at 13:02













              up vote
              1
              down vote










              up vote
              1
              down vote









              You can use a key-value cache solution. A good one is Redis. It`s very fast and simple, written on the C and more over than just a key value cache. Integration with python just several lines of code. The redis is also can be scaled very easy for the really big data. I worked in the game industry and understand what I am talking about.



              Also, as already mentioned here, you can use more complex solution, not a cache, the database PostgresSQL. Now it supports a JSON binary format field - JSONB. I think the best python database ORM is the SQLAlchemy. It supports PostgresSQL out of the box. I will use this one in my code block. For example, you have a table



              class MobTable(db.Model):
              tablename = 'mobs'

              id = db.Column(db.Integer, primary_key=True)
              stats = db.Column(JSONB, index=True, default={})


              If your have a mob with such json stats



              {
              id: 1,
              title: 'UglyOrk',
              resists: {cold: 13}
              }


              You can search all mobs with the not null cold resists



              expr = MobTable.stats[("resists", "cold")]
              q = (session.query(MobTable.id, expr.label("cold_protected"))
              .filter(expr != None)
              .all())





              share|improve this answer














              You can use a key-value cache solution. A good one is Redis. It`s very fast and simple, written on the C and more over than just a key value cache. Integration with python just several lines of code. The redis is also can be scaled very easy for the really big data. I worked in the game industry and understand what I am talking about.



              Also, as already mentioned here, you can use more complex solution, not a cache, the database PostgresSQL. Now it supports a JSON binary format field - JSONB. I think the best python database ORM is the SQLAlchemy. It supports PostgresSQL out of the box. I will use this one in my code block. For example, you have a table



              class MobTable(db.Model):
              tablename = 'mobs'

              id = db.Column(db.Integer, primary_key=True)
              stats = db.Column(JSONB, index=True, default={})


              If your have a mob with such json stats



              {
              id: 1,
              title: 'UglyOrk',
              resists: {cold: 13}
              }


              You can search all mobs with the not null cold resists



              expr = MobTable.stats[("resists", "cold")]
              q = (session.query(MobTable.id, expr.label("cold_protected"))
              .filter(expr != None)
              .all())






              share|improve this answer














              share|improve this answer



              share|improve this answer








              edited Dec 28 '15 at 9:05

























              answered Dec 27 '15 at 21:28









              theodor

              1,09411025




              1,09411025












              • I want to store a python dict in redis but as i am getting data every 15 minutes so i also need to update the redis every 15 minutes. Can you help me how I can achieve that? I mean, How i can dynamically assign a key every time i write data in redis as when i am using the same key.. it is overwriting the old data. If you can answer my question here on stackoverflow :- stackoverflow.com/questions/52712982/…
                – ak333
                Oct 9 at 13:02


















              • I want to store a python dict in redis but as i am getting data every 15 minutes so i also need to update the redis every 15 minutes. Can you help me how I can achieve that? I mean, How i can dynamically assign a key every time i write data in redis as when i am using the same key.. it is overwriting the old data. If you can answer my question here on stackoverflow :- stackoverflow.com/questions/52712982/…
                – ak333
                Oct 9 at 13:02
















              I want to store a python dict in redis but as i am getting data every 15 minutes so i also need to update the redis every 15 minutes. Can you help me how I can achieve that? I mean, How i can dynamically assign a key every time i write data in redis as when i am using the same key.. it is overwriting the old data. If you can answer my question here on stackoverflow :- stackoverflow.com/questions/52712982/…
              – ak333
              Oct 9 at 13:02




              I want to store a python dict in redis but as i am getting data every 15 minutes so i also need to update the redis every 15 minutes. Can you help me how I can achieve that? I mean, How i can dynamically assign a key every time i write data in redis as when i am using the same key.. it is overwriting the old data. If you can answer my question here on stackoverflow :- stackoverflow.com/questions/52712982/…
              – ak333
              Oct 9 at 13:02










              up vote
              1
              down vote













              I recommend you use HD5f. It's a data base format that works perfectly with Python (it is specifically developed for Python) and stores the data in binary format. This reduces the size of the data to be stored a great extent! More importantly it gives you the ability of random access which I believe serves for your purposes. Also, if you do not use any compression method you will retrieve the data with the highest possible speed.






              share|improve this answer























              • Double indexing did not work for me, e.g. group[state][1]. I can't also assign a dict, e.g. group[state] = { 0: 0.43, .., 6: 0.65 }.
                – Luke
                Dec 28 '15 at 8:08















              up vote
              1
              down vote













              I recommend you use HD5f. It's a data base format that works perfectly with Python (it is specifically developed for Python) and stores the data in binary format. This reduces the size of the data to be stored a great extent! More importantly it gives you the ability of random access which I believe serves for your purposes. Also, if you do not use any compression method you will retrieve the data with the highest possible speed.






              share|improve this answer























              • Double indexing did not work for me, e.g. group[state][1]. I can't also assign a dict, e.g. group[state] = { 0: 0.43, .., 6: 0.65 }.
                – Luke
                Dec 28 '15 at 8:08













              up vote
              1
              down vote










              up vote
              1
              down vote









              I recommend you use HD5f. It's a data base format that works perfectly with Python (it is specifically developed for Python) and stores the data in binary format. This reduces the size of the data to be stored a great extent! More importantly it gives you the ability of random access which I believe serves for your purposes. Also, if you do not use any compression method you will retrieve the data with the highest possible speed.






              share|improve this answer














              I recommend you use HD5f. It's a data base format that works perfectly with Python (it is specifically developed for Python) and stores the data in binary format. This reduces the size of the data to be stored a great extent! More importantly it gives you the ability of random access which I believe serves for your purposes. Also, if you do not use any compression method you will retrieve the data with the highest possible speed.







              share|improve this answer














              share|improve this answer



              share|improve this answer








              edited Nov 11 at 15:59

























              answered Dec 28 '15 at 0:51









              Amir

              4,62542548




              4,62542548












              • Double indexing did not work for me, e.g. group[state][1]. I can't also assign a dict, e.g. group[state] = { 0: 0.43, .., 6: 0.65 }.
                – Luke
                Dec 28 '15 at 8:08


















              • Double indexing did not work for me, e.g. group[state][1]. I can't also assign a dict, e.g. group[state] = { 0: 0.43, .., 6: 0.65 }.
                – Luke
                Dec 28 '15 at 8:08
















              Double indexing did not work for me, e.g. group[state][1]. I can't also assign a dict, e.g. group[state] = { 0: 0.43, .., 6: 0.65 }.
              – Luke
              Dec 28 '15 at 8:08




              Double indexing did not work for me, e.g. group[state][1]. I can't also assign a dict, e.g. group[state] = { 0: 0.43, .., 6: 0.65 }.
              – Luke
              Dec 28 '15 at 8:08










              up vote
              0
              down vote













              You can also store it as JSONB in PostgreSQL DB.



              For connecting with PostgreSQL you can use psycopg2, which is compliant with Python Database API Specification v2.0.






              share|improve this answer

























                up vote
                0
                down vote













                You can also store it as JSONB in PostgreSQL DB.



                For connecting with PostgreSQL you can use psycopg2, which is compliant with Python Database API Specification v2.0.






                share|improve this answer























                  up vote
                  0
                  down vote










                  up vote
                  0
                  down vote









                  You can also store it as JSONB in PostgreSQL DB.



                  For connecting with PostgreSQL you can use psycopg2, which is compliant with Python Database API Specification v2.0.






                  share|improve this answer












                  You can also store it as JSONB in PostgreSQL DB.



                  For connecting with PostgreSQL you can use psycopg2, which is compliant with Python Database API Specification v2.0.







                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Dec 28 '15 at 0:11









                  Marqin

                  829715




                  829715






























                      draft saved

                      draft discarded




















































                      Thanks for contributing an answer to Stack Overflow!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      To learn more, see our tips on writing great answers.





                      Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


                      Please pay close attention to the following guidance:


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function () {
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f34483567%2fwhich-database-to-store-very-large-nested-python-dicts%23new-answer', 'question_page');
                      }
                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown