Performing spatial queries in MySQL – Setup

I need to create a locations database for an app that I’m currently building and wanted to try out the features in MySQL. I’ve created a toy set of data mostly around London place names with one additional city of London-by-the-Sea in order to test closeness and Bounding Rectangles.

I’m sure I’m going to need to add more cities in order to do my tests, but here for posterity (and ease of retrieval) is my test fixture.

This is using MySQL 5.6 – there are additional spatial features in 5.7 but I don’t have it! More later when I’ll try doing something useful with the data

CREATE DATABASE places;

CREATE USER ‘george’@’localhost’ IDENTIFIED BY ‘george’;
GRANT ALL PRIVILEGES ON places.* TO ‘george’@’localhost’;

CREATE TABLE place
(
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
location GEOMETRY NOT NULL,
name VARCHAR(50),
country VARCHAR(2),
state VARCHAR(30)
) ENGINE=InnoDB;

INSERT INTO place
(name,country,location,state)
VALUES(‘London’,’GB’,GeomFromText(‘POINT(51.5142 -0.0931)’),null);

INSERT INTO place
(name,country,location,state)
VALUES(‘London’,’CA’,GeomFromText(‘POINT(42.9833 -81.25)’),’ON’);

INSERT INTO place
(name,country,location,state)
VALUES(‘London’,’NG’,GeomFromText(‘POINT(5.7167 5.7833)’),null);

INSERT INTO place
(name,country,location,state)
VALUES(‘London’,’US’,GeomFromText(‘POINT(37.129 -84.0833)’),’KY’);
INSERT INTO place
(name,country,location,state)
VALUES(‘London’,’ZA’,GeomFromText(‘POINT(-24.3 30.5833)’),null);

INSERT INTO place
(name,country,location,state)
VALUES(‘London’,’VE’,GeomFromText(‘POINT(10.3639 -66.7333)’),null);

INSERT INTO place
(name,country,location,state)
VALUES(‘London’,’PH’,GeomFromText(‘POINT(16.0097 125.1294)’),null);

INSERT INTO place
(name,country,location,state)
VALUES(‘London’,’US’,GeomFromText(‘POINT(43.0478 -89.0129)’),’WI’);

INSERT INTO place
(name,country,location,state)
VALUES(‘London’,’US’,GeomFromText(‘POINT(36.4761 -119.4432)’),’CA’);

INSERT INTO place
(name,country,location,state)
VALUES(‘Brighton’,’GB’,GeomFromText(‘POINT(50.8333 -0.154)’),null);

SELECT name, country, AsText(location) FROM place;

IPL – 2015 Roster done

Okay, so I’ve managed to pull in all the players from the sanitised Wikipedia text files. It took a bit longer than I thought as I ended up finding a bug in the version of neo4j-ogm I was using which meant that Repository findByProperty(prop) were actually returning all items! It was fixed in the next version, but that version wasn’t included in Spring Data Neo4j yet!

I ended up forking and creating my own fixes. It involved a fair bit of work although definitely worthwhile as I got to understand how the libraries work under the hood. There’s still a couple of todos in existing tests I need to fix but I’ve put those aside just for the moment. The fix is in the neo4j-ogm part of this forked repo.

As ever, there were some oddities in the data. For some reason, Imran Tahir’s contract value is set to NA. Usually when prices aren’t displayed that’s a bad sign! Is he getting paid that much?! Mind you he had a pretty decent season. The data all gets pulled in now and conforms to the model below.

Next steps is to put in some transaction hooks and begin consuming lifecycle events. At present the only lifecycle events are creates.

ipl-roster

IPL and Graph Databases

Most if not all of my toy projects (have to) involve something I’m actually interested in – space, books, er, traffic (I did say most) – so it was only a matter of time before cricket joined the party.

I’ve started a couple of projects that I put up on to github. The first was to take this year’s (2015) IPL roster from Wikipedia and parse it to create a set of Cypher to input in to a local Neo4J datastore. The implementation is pretty crude and was done more as a data modelling exercise. What would be a good representation of the roster. The output is in the following repository on my github.

The next evolution was to remodel the data. For example, I realised that if I wanted a multi-year store as well as data about the fixtures and results I needed to model Franchises and Teams separately).

More importantly, I decided that parsing pages and generating Cypher wasn’t really the way to go. I have big plans for this data and as such need a platform where I can insert the appropriate event hooks to do follow-on processing on the data (more in later posts). So it was with a heavy heart that I decided to use first Spring Data for neo4j (a heavy heart because the POJO model means a lot of concessions from the neo4j approach) and then later Spring Data for the great Repository features and neo4j-ogm for the actual Object-Graph Mapping.

It’s still a work in progress and can be found on this repo in github. I’m pretty excited about the other ideas I have for it – if only work, household chores, and socialising  didn’t get in the way!

Regulex – where were you 15 years ago?!

I really like regular expressions, but they’re a bit like the piano. The more you practice the better you get, but if you don’t use them for a while you’ll soon get rusty.

Back in the day we had a product that required us to use a lot of regular expressions. We soon got proficient, but I’d be lying if I said it was a smooth ride all the time. So it’s with retrospective envy that I look at jex.im, a JavaScript regular expression visualizer. I’ve not tried it with anything particularly exotic but even now I can see me using it to er, check my regex.

Also I can’t write a post about Regular Expressions without mentioning one of the best technical books ever – Mastering Regular Expressions by Jeffrey Friedl. I’ve purged a lot of technical books and I don’t really use it that often, but that is one that still sits proudly on my shelf! It also reminds me of a family holiday to Sardinia where I took it with me to read!