How to Do MongoDB Aggregation Queries Easily with 3T MongoChef

In this post we’re going to take a look at how to do MongoDB aggregation queries easily with the amazing new Aggregation Screen in 3T MongoChef.

Prefer to watch?
See the accompanying MongoDB Aggregation video.

For this tutorial, we’re going to build a query based on the freely available housing data from the City of Chicago Data Portal to learn how to use the incredible new features and query support the Aggregation Screen provides.

If you haven’t installed 3T MongoChef 3.0 already, it’s available for Windows, Mac and Linux here: http://3t.io/mongochef/

Creating the Aggregation Query

Once we’ve opened up 3T MongoChef and connected to the database, we can select the collection we wish to query:

Select Collection for MongoDB Aggregation Query

We can open the Aggregation Screen by clicking the large ‘Aggregate‘ button in the main tool bar at the top, selecting ‘Open Aggregation Screen‘ from the right-click context menu or by pressing the ‘F4‘ shortcut key.

We now have an empty aggregation query, ready to be filled up, so let’s get cracking!

New Pipeline MongoDB Aggregation Query

Identifying the Question We Want to Answer

The question we want to ask of our data is simple:

Which zip codes have the greatest number of senior housing units available?

To think how we’ll answer this and how we’ll form our query, let’s take a look at the data. Click ‘Execute full pipeline‘ (executing an empty pipeline simply shows the contents of the collection).

Full Pipeline Results MongoDB Aggregation Query

If you prefer a JSON view of the data (and 3T MongoChef supports dynamically switching between tree, table and JSON views of your result data), it’s included below:

{ 
    "_id" : ObjectId("544f9533d4c6dc758c28fde4"), 
    "community_area" : {
        "name" : "Albany Park", 
        "number" : 14
    }, 
    "property" : {
        "type" : "Senior", 
        "name" : "Mayfair Commons"
    }, 
    "address" : "4444 W. Lawrence Ave.", 
    "zip_code" : "60630", 
    "phone_number" : "773-205-7862", 
    "management_company" : "Metroplex, Inc.", 
    "units" : 97, 
    "location" : {
        "x_coordinate" : 1145674.7538177613, 
        "y_coordinate" : 1931569.979044555, 
        "latitude" : 41.9682242321, 
        "longitude" : -87.7397474866, 
        "description" : "4444 W Lawrence Ave\n(41.968224232060564, -87.73974748655358)"
    }
}

OK, so we can see we have the fields we need – we can check "property.type" to see that it’s senior housing, and "zip_code" and "units" give us the zip code and number of available units there are, respectively.

To answer our question, we need to combine these into the right aggregation query. Let’s create the first stage of our query where we’ll match against the senior property type.

Adding a New Stage

Click ‘Add New Stage‘ and you’ll see a new stage in the ‘Pipeline‘ tab.

New Stage MongoDB Aggregation Query

Double click the new stage to edit it (or simply select the ‘Stage 1‘ tab):

Match Operator MongoDB Aggregation Query

The screenshot above jumps ahead a little bit as the stage specification has already been filled, but let’s break down each piece in turn.

First, notice the ‘$match‘ in the combo box. It’s here where we select the stage’s ‘operator’. A stage operator defines what the stage actually does. The ‘$match‘ operator takes the input set of documents and outputs only those that match the given criteria. It is essentially a filter. A full list of the supported operators and their meaning is available here: http://docs.mongodb.org/manual/meta/aggregation-quick-reference/ (this link is always readily available by clicking ‘Operator Quick Reference’ in the app).

For convenience, the specification of the Stage 1 ‘$match‘ operator is repeated below:

{
    "property.type": "Senior"
}

In the stage’s specification, we can see that we are matching against the "Senior" property type, meaning only documents with a value of "Senior" for the field "property.type" will be passed onto the (yet to be created) next stage of the pipeline for further processing.

We can check the output of this and any other stage at any time by clicking ‘Show output from the selected stage‘. Similarly, we can see the input of any stage at any time by clicking ‘Show input to the selected stage‘. This is a really nice and convenient feature, as it makes keeping track of the precise form of the data we are working at each stage in the pipeline really easy.

We can see in the ‘Stage 1 output‘ tab that we have the results we need from this stage, and so let’s go on and create the next.

Grouping Results

We now need a way to group together the results from Stage 1 on zip code and then add up each of the available units figures. The ‘$group‘ operator is exactly what we need for this.

Group Operator MongoDB Aggregation Query

The Stage 2 ‘$group‘ operator specification is repeated below:

{
    _id: "$zip_code",
    totalUnits: { $sum: "$units" }
}

The specification of Stage 2 states that the output of this stage will be documents that have an “_id" with a distinct zip code as a value and so will group together documents input to this stage that have the same zip code, and a “totalUnits" field whose value is the sum of all the "units" field values from each of the documents in the group. We can see the input to and output from tabs for this stage in the screenshot and can confirm that a reduction has taken place – of the 70 documents input to this stage, there were 36 distinct zip codes, and so the corresponding 36 documents are output from this stage.

Finding the Answer

As we want to know the zip codes that have the greatest number of senior housing units available, it would be convenient to sort the results from the greatest to the least total units available.

To do this, we’ll create a third stage using the ‘$sort‘ operator with the following specification, giving us exactly what we want:

{
    totalUnits: -1
}

Full Query and Results MongoDB Aggregation Query

Going back to the ‘Pipeline‘ tab we can see the result of the execution of the full query, as well as the full query itself, all in one one place. We can see we have the expected number of results from the full pipeline, and we can now answer our question – we have a list of the zip codes that have the greatest number of senior housing units available.

Wasn’t that easy? :-)

Specifying Query Options

Depending on your own particular query, you may wish to specify options such as to use a database cursor for the results (if the results are large), allow the query to write temporary intermediate results to disk or rather than run the query, explain aspects of the processing of the query.

These options can be set in the ‘Options‘ tab. Note that these options only became available in MongoDB 2.6, so if you are connected to a MongoDB 2.4 or earlier instance, the ‘Options‘ tab is not shown.

Options MongoDB Aggregation Query

Sharing Aggregation Queries

The aggregation queries you have created can be saved to and loaded from file, so not only can you reload them in future sessions, but you can also share them with other colleagues and users.  There is also a preview of the raw MongoDB script of the aggregation query available by selecting ‘Show Query Preview‘ from the context menu. This can be rather handy if you simply wish to examine the raw underlying script, or make a quick copy to stick in an email, or perhaps combine in it in a larger, more complex query in 3T MongoChef’s IntelliShell.

Query Preview MongoDB Aggregation Query

Handy References

It can take a bit of time to master all the different operators available in the aggregation pipeline, so links to the MongoDB Aggregation Pipeline Quick Reference and the Aggregation Section of the MongoDB Manual are always available within a click’s reach directly in the app itself via the ‘Operator Quick Reference‘ and ‘Aggregation Tutorial‘ links, respectively. It wont be too long before you’re masterfully producing complex MongoDB aggregation queries of your own!

 

OK, that’s it for this post. I hope you feel the same delight as we do about the amazing new features, convenience and boost to productivity the new Aggregation Screen in 3T MongoChef offers.

Please do check out 3T MongoChef, the best GUI for MongoDB. A little example of the rich code completion and easy in-line editing experience it offers is shown below:

IntelliShell for MongoDB

We’re always very keen to hear about your experiences and ideas for 3T MongoChef. If you’d like to tell us about them please visit our feedback page or click the ‘Feedback‘ tool bar button in the app.

Also, please check out our Schema Explorer & Documentation and Data Compare & Sync tools at 3T.io, as I expect they’ll also help in making you a much more powerful and productive MongoDB user.

0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published. Required fields are marked *