MongoDB best practice question array or stdClass

From:
Loic d'Anterroches
Date:
2012-05-18 @ 09:09
Hello,

just wondering, how are you managing the output of MongoDB? Let's say you
have users in the db: when you query Mongo, do you convert the array
from the output to a stdClass or a "YourUser" class?

I am about to open-source my library of Photon utilities and I can
see that I am not very consistent. I like to work with stdClass/objects,
but what do *you* like?

loïc

Re: [photon.users] MongoDB best practice question array or stdClass

From:
William Martin
Date:
2012-05-18 @ 09:39
Hi Loic,

In my current project, I use a stdClass directly from Mongo.
To avoid password leaks, my authentication backend removes the password
and YubiKey data after the user is loaded.
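
[Editor's sketch] The step described above, removing sensitive fields once
the user is loaded, could look roughly like this. The helper name and the
field names are assumptions for illustration, not William's actual code,
and a plain array stands in for the document returned by the driver.

```php
<?php
// Hypothetical helper: the default field names are assumptions, not a real schema.
function stripSensitiveFields(stdClass $user, array $fields = ['password', 'yubikey']): stdClass
{
    $safe = clone $user; // shallow copy, so the originally loaded object is untouched
    foreach ($fields as $field) {
        unset($safe->$field); // unsetting a missing property is harmless
    }
    return $safe;
}

// A plain array stands in for the document returned by the MongoDB driver.
$doc = ['login' => 'william', 'password' => 'hash', 'yubikey' => 'key-id'];
$user = stripSensitiveFields((object) $doc);
```

The (object) cast gives the stdClass mentioned in the message; the rest of
the application then only ever sees the stripped copy.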

William

-- 
---------------------------------------------------------
William MARTIN
wysman @NoSpAm@ gmail @DoT@ com

Re: [photon.users] MongoDB best practice question array or stdClass

From:
Nicolas
Date:
2012-05-18 @ 10:05
Hello,

I store the data I retrieve from MongoDB (or any database/datastore, in
fact) in well-defined objects. I tried various ways to handle that, and
I'm now convinced that using a specific class works best, at least with the
way the rest of my code is structured.

First of all, there are usually two kinds of data about your user that you
store in MongoDB: core fields (which are mandatory) and peripheral fields
(which are secondary). My "User" class is usually built this way:

*It explicitly declares the mandatory fields it handles* (e.g. id,
email, password, last activity date, ...)
Each field usually has specific business rules, but it's easy to have
methods to transform or validate the data you handle.

*It explicitly declares an array that will contain my secondary
data* (e.g. posts list, pictures, friends, ...)
Iterator interfaces are your friends here.
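
[Editor's sketch] A class built along the lines described above might look
like the following. This is a minimal sketch, not Nicolas's actual code: the
class name, field names, the fromDocument() factory and the lowercasing rule
are all illustrative assumptions, and a plain array stands in for the
document returned by the driver.

```php
<?php
// Sketch only: class, field and method names are illustrative assumptions.
final class User implements IteratorAggregate
{
    // Core (mandatory) fields are declared explicitly.
    public $id;
    public $email;
    public $lastActivity;

    // Secondary data lives in one declared array.
    private $extra = [];

    public static function fromDocument(array $doc): self
    {
        foreach (['_id', 'email'] as $required) {
            if (!isset($doc[$required])) {
                throw new InvalidArgumentException("Missing field: $required");
            }
        }
        $user = new self();
        $user->id = (string) $doc['_id'];
        $user->setEmail($doc['email']);
        $user->lastActivity = $doc['last_activity'] ?? null;
        // Everything that is not a core field is treated as secondary data.
        $core = array_flip(['_id', 'email', 'last_activity']);
        $user->extra = array_diff_key($doc, $core);
        return $user;
    }

    // Example business rule: validate, then normalise (an assumed rule).
    public function setEmail(string $email): void
    {
        if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException("Invalid email: $email");
        }
        $this->email = strtolower($email);
    }

    // Lets callers foreach over the secondary data.
    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->extra);
    }
}

$user = User::fromDocument([
    '_id' => 'abc123',
    'email' => 'Nicolas@Example.org',
    'posts' => [1, 2],
    'friends' => ['loic'],
]);
```

With IteratorAggregate in place, "foreach ($user as $name => $value)" walks
the secondary data without exposing the underlying array.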


Handling exceptions (I usually have at least "NoDataFound" and
"DataAccessError" exceptions) in this context is a breeze. And it will get
even better with the addition of traits, which make it easy to share the
storage/retrieval and transformation/validation code between all my
data-filled classes.
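
[Editor's sketch] The two exception names above come from the message;
everything else below is a hypothetical illustration of how a trait
(available from PHP 5.4) could share the retrieval code between classes.
The $fetch callable stands in for the real datastore call.

```php
<?php
// Exception names come from the message; the rest is hypothetical.
class NoDataFound extends RuntimeException {}
class DataAccessError extends RuntimeException {}

trait LoadsFromStore
{
    // $fetch is any callable returning an associative array or null.
    public static function load(callable $fetch, $id)
    {
        try {
            $doc = $fetch($id);
        } catch (Exception $e) {
            // Wrap low-level failures, keeping the original as "previous".
            throw new DataAccessError("Could not read record $id", 0, $e);
        }
        if ($doc === null) {
            throw new NoDataFound("No record with id $id");
        }
        $object = new static();
        foreach ($doc as $field => $value) {
            // Only fill properties the class actually declares.
            if (property_exists($object, $field)) {
                $object->$field = $value;
            }
        }
        return $object;
    }
}

class Post
{
    use LoadsFromStore;
    public $title;
}

// An in-memory array plays the role of the datastore.
$store = ['p1' => ['title' => 'Hello']];
$fetch = function ($id) use ($store) {
    return $store[$id] ?? null;
};

$post = Post::load($fetch, 'p1');

try {
    Post::load($fetch, 'missing');
    $outcome = 'found';
} catch (NoDataFound $e) {
    $outcome = 'not found';
}
```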

The main downside of my method is the lack of datastore abstraction in my
code. But the data access functions are well isolated and can be rewritten
without too much pain. And people don't change their datastore very often.

Don't get me wrong: using a stdClass instance works just fine. But good
use of PHP's OOP features makes everything easier and shorter in that
context.

Nicolas





Re: [photon.users] MongoDB best practice question array or stdClass

From:
Loic d'Anterroches
Date:
2012-05-18 @ 12:18
Hello,

> I store the data I retrieve from MongoDB (or any database/datastore, in
> fact) in well-defined objects. I tried various ways to handle that, and
> I'm now convinced that using a specific class works best, at least with
> the way the rest of my code is structured.

I follow you on everything you write, with just one problem: when you
find() with MongoDB you get an associative array, which you then need to
convert to your object before you can work on it.

This is basically an object mapped to a document. I found that as soon as
I started to work with 100k+ documents, this approach broke down with
respect to performance.

Basically, for a limited number of well-defined, not-so-complex
documents, this is perfect. But as soon as you get a large number of
complex documents, as in my case for Cheméo, where a single document
stores all the data available here:

 http://chemeo.com/cid/45-039-9

this fails. The Doctrine approach of "document" mapping would require an
object per property (more or less), which means 1000+ objects per
document. Simply mad.

The way I am going at the moment is to have very "limited" objects for
convenience, with a very limited number of methods (if any) but a
well-defined structure, and to keep the data manipulation logic in
libraries. A kind of functional programming approach.
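
[Editor's sketch] One way to read the approach above: a small class with a
declared structure and no behaviour, with the logic kept in library-style
functions that work on plain data. All names, the document shape and the
values below are invented for illustration; this is not Cheméo's or
Photon's actual code.

```php
<?php
// Hypothetical names throughout; document shape and values are made up.

// A "limited" object: declared structure, no methods.
final class Compound
{
    public $cid;
    public $name;
    public $properties = []; // kept as plain arrays, not one object each
}

// Library-style functions hold the logic.
function compound_from_document(array $doc): Compound
{
    $c = new Compound();
    $c->cid = $doc['cid'];
    $c->name = $doc['name'];
    $c->properties = $doc['properties'] ?? [];
    return $c;
}

function properties_with_unit(Compound $c, string $unit): array
{
    // array_values() reindexes the filtered result from 0.
    return array_values(array_filter(
        $c->properties,
        function (array $p) use ($unit) {
            return $p['unit'] === $unit;
        }
    ));
}

// A plain array stands in for the document returned by find().
$doc = [
    'cid' => 'example-cid',
    'name' => 'example compound',
    'properties' => [
        ['symbol' => 'a', 'value' => 1.0, 'unit' => 'K'],
        ['symbol' => 'b', 'value' => 2.0, 'unit' => 'kPa'],
        ['symbol' => 'c', 'value' => 3.0, 'unit' => 'K'],
    ],
];
$compound = compound_from_document($doc);
$kelvin = properties_with_unit($compound, 'K');
```

However many properties a document carries, only one object is created per
document; the nested data stays as arrays.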

This is not perfect, probably a limitation of my programming skills
combined with the limitations of the PHP language, but for me this is
the best in terms of performance.

A lot of it is "personal" taste too. This is in fact why I "duck type"
everything in Photon: to leave users the ability to follow their own
taste for all these non-critical choices.

Thanks for the comments, this is helping me.

loïc






-- 
Dr Loïc d'Anterroches
Founder Céondo Ltd

w: www.ceondo.com       |  e: loic@ceondo.com
t: +44 (0)207 183 0016  |  f: +44 (0)207 183 0124

Céondo Ltd
Dalton House
60 Windsor Avenue
London
SW19 2RR / United Kingdom

Re: [photon.users] MongoDB best practice question array or stdClass

From:
Nicolas
Date:
2012-05-18 @ 13:41
On Fri, May 18, 2012 at 2:18 PM, Loic d'Anterroches <loic@ceondo.com> wrote:

> Hello,
>
> > I store the data I retrieve from MongoDB (or any database/datastore, in
> > fact) in well-defined objects. I tried various ways to handle that, and
> > I'm now convinced that using a specific class works best, at least with
> > the way the rest of my code is structured.
>
> I follow you on everything you write, with just one problem: when you
> find() with MongoDB you get an associative array, which you then need to
> convert to your object before you can work on it.
>
> This is basically an object mapped to a document. I found that as soon as
> I started to work with 100k+ documents, this approach broke down with
> respect to performance.
>
> Basically, for a limited number of well-defined, not-so-complex
> documents, this is perfect. But as soon as you get a large number of
> complex documents, as in my case for Cheméo, where a single document
> stores all the data available here:
>
>  http://chemeo.com/cid/45-039-9
>
> this fails. The Doctrine approach of "document" mapping would require an
> object per property (more or less), which means 1000+ objects per
> document. Simply mad.
>

At that scale, my method is probably too slow and memory-consuming.

Most of my sites have at most 10,000 documents. Some might be complex, but
it still seems I'm far from that scale.



> The way I am going at the moment is to have very "limited" objects for
> convenience, with a very limited number of methods (if any) but a
> well-defined structure, and to keep the data manipulation logic in
> libraries. A kind of functional programming approach.
>

Working with the lightest possible objects is probably your best bet.
Especially if you really need to load and process that much data at every
request.

It seems to me you could probably save yourself a lot of trouble by
pre-processing as much as possible. Don't get me wrong: I'm not
claiming Chemeo isn't well thought out. I'm just saying that your "problem"
of having a lot of objects to work with at every page generation might be
avoidable by storing, retrieving or caching your data differently. But I
can't say for sure, since I don't know how Chemeo works and I basically suck
at chemistry :)



> This is not perfect, probably a limitation of my programming skills
> combined with the limitations of the PHP language, but for me this is
> the best in terms of performance.
>

Maybe it's just because it's a complicated problem. Lots of data to process
in little time with few resources: it will be challenging no matter
what language you choose.



> A lot of it is "personal" taste too. This is in fact why I "duck type"
> everything in Photon: to leave users the ability to follow their own
> taste for all these non-critical choices.
>

Although I'm still a Photon beginner, that's actually one of the
big strengths I noticed in the framework. You need to understand how it
works, but once you do, you start to realize that it easily adapts to your
own methods and practices.



> Thanks for the comments, this is helping me.
>


No problem.
Sorry for digressing from the base topic :)

Nicolas