New Year, New Adventures

Just over two years ago, I embarked upon a journey as a developer / evangelist for a company that was then called 10gen (which got fed up of saying "the MongoDB people", and transformed into MongoDB Inc). My goals for this role were: to learn what it was like working for a company that produced a technology product; to discover what impact working in an open source fashion has; and to level up my advocacy skills. I have met all these goals, and more - I met some fantastic people; learnt different approaches to software development; discovered my new favourite database for creating applications; moved to Spain; started both a MUG and a JUG; worked to understand the value of community and evangelism, and to help create a strategy for these areas; and my evangelism efforts and open source work earned me the Java Champion title. I'm extremely proud of what I've achieved over this period, and very grateful to MongoDB for giving me these opportunities.

But now, a new adventure is about to begin. If you've seen my live coding demo this year, you'll know of my love affair with IntelliJ IDEA, a tool I use daily (even for blogging). Well, now I'm joining the team at JetBrains, where I'm going Full Advocate. I hope this means I get to carry on doing more of what I love - presenting, writing, and working on demos to help developers become more productive. I hope this will give me opportunities to stay ahead of the curve in the Java/JVM world.

And yes, in answer to the Most Frequently Asked Question, I am staying in Spain. I've fallen in love with Sevilla and I'm not ready to leave yet.

I shall leave you with my somewhat disastrous "Top Ten IntelliJ Tips" from GOTO Aarhus, which is worth watching just to see Dan North save me from the curse of the live demo. Things can only get better from here, right?

MongoDB London

<em>TL;DR MongoDB London, 6th November, 50% off with discount code 50Trisha.</em>

So, MongoDB London is nearly upon us again, and I’m dead disappointed I can’t make it this year (I’m [keynoting at GOTO Berlin](http://gotocon.com/berlin-2014/presentation/Party%20Keynote:%20Staying%20Ahead%20of%20the%20Curve) instead, which I’m terrified, I mean, excited, about). The last MongoDB London was really interesting for me - I was fairly new to the company, I’d been here less than six months, and it was a really great way to go both broad and deep on the technology.

I was trying to find a blog post where I talked about my experiences, but it looks like I only wrote that In My Mind. In fact, I only wrote two whole lines of notes on the conference. But 18 months later I clearly remember presentations from my colleagues Ross and Derick demonstrating the geo capabilities in MongoDB, presentations which heavily influenced the live coding demo I’ve been giving recently.

I also remember Eliot’s presentation - Eliot is one of those people who totally gets away with breaking the “speak in a measured, clear fashion” presenter’s rule: he fires fascinating information at you at high speed, and it’s actually one of the compelling things about his talks. If you ever get a chance to see him talk about the product, it’s totally worth it.

Other than that, the most awesome thing about the conference was the chance to meet, and talk to, a bunch of different MongoDB people - there are the engineers who work on the product (like me and my colleagues); the people leading the way with the technology, like Eliot; and finally, but for me most importantly, you get to meet really interesting people who are using MongoDB in ways that you might not even imagine.

Anyway the point of this sales pitch is, whether you’re using MongoDB already or you’re keen to find out about it, MongoDB London is only going to take a day from your life, and you’ll learn a bunch of interesting things. And, with a Special Discount Code from me, it’s only £45! Sign up with code 50Trisha.

Using Groovy to import XML into MongoDB

This year I've been demonstrating how easy it is to create modern web apps using AngularJS, Java and MongoDB. I also use Groovy during this demo to do the sorts of things Groovy is really good at - writing descriptive tests, and creating scripts.

Due to the time pressures in the demo, I never really get a chance to go into the details of the script I use, so the aim of this long-overdue blog post is to go over this Groovy script in a bit more detail.

Firstly I want to clarify that this is not my original work - I stole, I mean borrowed, most of the ideas for the demo from my colleague Ross Lawley. In this blog post he goes into detail of how he built up an application that finds the most popular pub names in the UK. There's a section in there where he talks about downloading the OpenStreetMap data and using Python to convert the XML into something more MongoDB-friendly - it's this process that I basically stole, re-worked for coffee shops, and re-wrote for the JVM.

I'm assuming if you've worked with Java for any period of time, there has come a moment where you needed to use it to parse XML. Since my demo is supposed to be all about how easy it is to work with Java, I did not want to do this. When I wrote the demo I wasn't really all that familiar with Groovy, but what I did know was that it has built in support for parsing and manipulating XML, which is exactly what I wanted to do. In addition, creating Maps (the data structures, not the geographical ones) with Groovy is really easy, and this is effectively what we need to insert into MongoDB.

Goal of the Script

  • Parse an XML file containing open street map data of all coffee shops.
  • Extract latitude and longitude XML attributes and transform into MongoDB GeoJSON.
  • Perform some basic validation on the coffee shop data from the XML.
  • Insert into MongoDB.
  • Make sure MongoDB knows this contains query-able geolocation data.

The script is PopulateDatabase.groovy; that link will take you to the version I presented at JavaOne.

Firstly, we need data

I used the same service Ross used in his blog post to obtain the XML file containing "all" coffee shops around the world. Now, the open street map data is somewhat... raw and unstructured (which is why MongoDB is such a great tool for storing it), so I'm not sure I really have all the coffee shops, but I obtained enough data for an interesting demo using

http://www.overpass-api.de/api/xapi?*[amenity=cafe][cuisine=coffee_shop]

The resulting XML file is in the github project, but if you try this yourself you might (in fact, probably will) get different results.

Each XML record looks something like:

<node id="178821166" lat="40.4167226" lon="-3.7069112">
    <tag k="amenity" v="cafe"/>
    <tag k="cuisine" v="coffee_shop"/>
    <tag k="name" v="Chocolatería San Ginés"/>
    <tag k="wheelchair" v="limited"/>
    <tag k="wikipedia" v="es:Chocolatería San Ginés"/>
</node>

Each coffee shop has a unique identifier and a latitude and longitude as attributes of a node element. Within this node is a series of tag elements, all with k and v attributes. Each coffee shop has a varying number of these attributes, and they are not consistent from shop to shop (other than amenity and cuisine which we used to select this data).

Initialisation

Before doing anything else we want to prepare the database. The assumption of this script is that either the collection we want to store the coffee shops in is empty, or full of stale data. So we're going to use the MongoDB Java Driver to get the collection that we're interested in, and then drop it.

There are two interesting things to note here:

  • This Groovy script is simply using the basic Java driver. Groovy can talk quite happily to vanilla Java, it doesn't need to use a Groovy library. There are Groovy-specific libraries for talking to MongoDB (e.g. the MongoDB GORM Plugin), but the Java driver works perfectly well.
  • You don't need to create databases or collections (collections are a bit like tables, but less structured) explicitly in MongoDB. You simply use the database and collection you're interested in, and if they don't already exist, the server will create them for you.

In this example, we're just using the default constructor for the MongoClient, the class that represents the connection to the database server(s). This default is localhost:27017, which is where I happen to be running the database. However you can specify your own address and port - for more details on this see Getting Started With MongoDB and Java.
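As a rough sketch of this initialisation step, assuming the legacy Java driver (the `org.mongodb:mongo-java-driver` artifact) is on the classpath and a MongoDB server is listening on localhost:27017. The database name `Cafelito` and collection name `CoffeeShop` are placeholders chosen for illustration, not necessarily the names the real script uses.

```groovy
import com.mongodb.MongoClient

// The default constructor points at localhost:27017
def mongoClient = new MongoClient()

// Assumed names: neither the database nor the collection has to exist beforehand
def collection = mongoClient.getDB('Cafelito').getCollection('CoffeeShop')

// Throw away any stale data before re-importing
collection.drop()

// To talk to a different server, pass a host and port instead, e.g.
// def otherClient = new MongoClient('db.example.com', 27017)
```

Note that nothing here creates the database or the collection explicitly; they come into existence when the first document is inserted.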

Turn the XML into something MongoDB-shaped

Parse & Transform XML

So next we're going to use Groovy's XmlSlurper to read the open street map XML data that we talked about earlier. To iterate over every node we use: xmlSlurper.node.each. For those of you who are new to Groovy or new to Java 8, you might notice this is using a closure to define the behaviour to apply for every "node" element in the XML.
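Here is a minimal sketch of that iteration, using a small inline sample rather than the real downloaded file. The sample XML and the `ids` list are assumptions for illustration only. On Groovy 2.x, `XmlSlurper` lives in `groovy.util` and is imported by default, so the import line can be dropped there.

```groovy
import groovy.xml.XmlSlurper

// Inline sample standing in for the downloaded OpenStreetMap file
def xml = '''<osm>
  <node id="178821166" lat="40.4167226" lon="-3.7069112">
    <tag k="amenity" v="cafe"/>
    <tag k="cuisine" v="coffee_shop"/>
  </node>
  <node id="18464077" lat="-33.8911183" lon="151.1958773">
    <tag k="amenity" v="cafe"/>
  </node>
</osm>'''

// parseText returns the root element; for a file, use parse(new File(...)) instead
def xmlSlurper = new XmlSlurper().parseText(xml)

def ids = []
// The closure runs once for every <node> child of the root element
xmlSlurper.node.each { node ->
    ids << node.@id.text()   // @id reads the XML attribute called "id"
}

println ids
```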

Create GeoJSON

Since MongoDB documents are effectively just maps of key-value pairs, we're going to create a Map coffeeShop that contains the document structure that represents the coffee shop that we want to save into the database. Firstly, we initialise this map with the attributes of the node. Remember these attributes are something like:

<node id="18464077" lat="-33.8911183" lon="151.1958773">

We're going to save the ID as a value for a new field called openStreetMapId. We need to do something a bit more complicated with the latitude and longitude, since we need to store them as GeoJSON, which looks something like:

{ 'location' : { 'coordinates': [<longitude>, <latitude>],
                 'type'       : 'Point' } }

In lines 12-14 you can see that we create a Map that looks like the GeoJSON, pulling the lat and lon attributes into the appropriate places.
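The map-building step might look something like the sketch below. It is an approximation rather than the exact code from the script: keeping the ID as a String and converting the coordinates with `Double.valueOf` are my assumptions. On Groovy 2.x the import can be dropped.

```groovy
import groovy.xml.XmlSlurper

// A single <node> element, parsed on its own for illustration
def node = new XmlSlurper().parseText(
        '<node id="18464077" lat="-33.8911183" lon="151.1958773"/>')

// GeoJSON lists longitude first, then latitude
def coffeeShop = [
        openStreetMapId: node.@id.text(),
        location       : [coordinates: [Double.valueOf(node.@lon.text()),
                                        Double.valueOf(node.@lat.text())],
                          type       : 'Point']
]

println coffeeShop
```

The thing to watch is the order of the coordinates: the XML gives lat then lon, but GeoJSON wants them the other way round.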

Insert Remaining Fields

Validate Field Name

Now for every tag element in the XML, we get the k attribute and check if it's a valid field name for MongoDB (it won't let us insert fields with a dot in, and we don't want to override our carefully constructed location field). If so we simply add this key as the field and its matching v attribute as the value into the map. This effectively copies the OpenStreetMap key/value data into key/value pairs in the MongoDB document so we don't lose any data, but we also don't do anything particularly interesting to transform it.
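A sketch of the copy-and-validate step could look like this. The `isValidFieldName` closure is a hypothetical helper written for this example, and the sample tags (including the dotted `addr.city` key) are made up to exercise both rejection rules. On Groovy 2.x the import can be dropped.

```groovy
import groovy.xml.XmlSlurper

// Hypothetical helper: reject keys with a dot in, and the key we reserve for GeoJSON
def isValidFieldName = { String fieldName ->
    !fieldName.contains('.') && fieldName != 'location'
}

def node = new XmlSlurper().parseText('''<node id="1" lat="40.4" lon="-3.7">
  <tag k="amenity" v="cafe"/>
  <tag k="name" v="Example Cafe"/>
  <tag k="addr.city" v="Madrid"/>
  <tag k="location" v="upstairs"/>
</node>''')

def coffeeShop = [openStreetMapId: node.@id.text()]

// Copy every valid k/v pair from the <tag> elements straight into the map
node.tag.each { tag ->
    def fieldName = tag.@k.text()
    if (isValidFieldName(fieldName)) {
        coffeeShop[fieldName] = tag.@v.text()
    }
}

println coffeeShop
```

Only amenity and name make it into the map here; the dotted key and the clashing location key are silently skipped.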

Save Into MongoDB

Finally, once we've created a simple coffeeShop Map representing the document we want to save into MongoDB, we insert it into MongoDB if the map has a field called name. We could have checked this when we were reading the XML and putting it into the map, but it's actually much easier just to use the pretty Groovy syntax to check for a key called name in coffeeShop.

When we want to insert the Map we need to turn this into a BasicDBObject, the Java Driver's document type, but this is easily done by calling the constructor that takes a Map. Alternatively, there's a Groovy syntax which would effectively do the same thing, which you might prefer:

collection.insert(coffeeShop as BasicDBObject)
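To show just the name check without needing a running server, here is a small sketch where a plain list called `saved` stands in for the MongoDB collection. The list and the two sample maps are assumptions for illustration; the real script inserts into the collection at the marked line.

```groovy
// Stand-in for the MongoDB collection, so the check runs without a database
def saved = []

def shops = [
        [openStreetMapId: '1', name: 'Example Cafe'],
        [openStreetMapId: '2', amenity: 'cafe']      // no name tag in the XML
]

shops.each { coffeeShop ->
    // A missing key gives null, which Groovy treats as false
    // (an empty-string name would be skipped as well)
    if (coffeeShop.name) {
        saved << coffeeShop   // the real script calls collection.insert(...) here
    }
}

println saved
```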

Tell MongoDB that we want to perform Geo queries on this data

Add Geo Index

Because we're going to do a nearSphere query on this data, we need to add a "2dsphere" index on our location field. We created the location field as GeoJSON, so all we need to do is call createIndex for this field.
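A sketch of the index creation, under the same assumptions as before: the legacy Java driver on the classpath, a server on localhost:27017, and the placeholder names `Cafelito` and `CoffeeShop`.

```groovy
import com.mongodb.BasicDBObject
import com.mongodb.MongoClient

// Assumed database and collection names, for illustration only
def collection = new MongoClient().getDB('Cafelito').getCollection('CoffeeShop')

// Index the GeoJSON 'location' field so geospatial queries can use it
collection.createIndex(new BasicDBObject('location', '2dsphere'))
```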

Conclusion

So that's it! Groovy is a nice tool for this sort of script-y thing - not only is it a scripting language, but its built-in support for XML, really nice Map syntax and support for closures make it the perfect tool for iterating over XML data and transforming it into something that can be inserted into a MongoDB collection.

Getting Started with MongoDB and Java

We've been missing an introduction to using MongoDB from Java for a little while now - there's plenty of information in the documentation, but we were lacking a step-by-step guide to getting started as a Java developer.

I sought to rectify this with a couple of blog posts for the MongoDB official blog: the first, an introduction to using MongoDB from Java, including a non-comprehensive list of some of the libraries you can use; the second, an introductory guide to simple CRUD operations using the Java driver.

Sevilla MUG March Madness

Last night the Sevilla MUG had our March Madness event. This was our largest event yet, with 36 people signed up. Although the aim of March Madness is to have a MongoDB Engineer at all the user groups this month, that's not such a big deal for us as I live here, so this was also the first event where I wasn't the main attraction - [Javier](https://twitter.com/JvrBaena) gave a really great talk about the lessons learnt at SocialBro after using MongoDB in production for the last couple of years.

The slides for my introduction to Replica Sets and Sharding:

And the slides for the main attraction:

The event went really well, everyone seemed engaged, and our first talk in Spanish seemed to encourage more questions than normal. We were also in a new venue, and although I love the central location of our previous venue and the friendliness of the owners, the menu of the new location seemed to be a massive win.