Top tips for running a data science journal club
Like modern web services, best practices are constantly iterated on and improved. Here at Lyst we’ve adopted many of our favourite practices, and as the team has grown we’ve been tweaking them to better suit how we work. We’ve also been asking new team members to share their previous experiences and opinions on what works well for various aspects of our team. This has been really good for us, and we’ve been wondering how we could get more of this outside influence.
At Lyst we’ve been improving our testing environments over the last year or so, and one of the main things we wanted to improve was our Selenium testing stack. We’ve used Selenium in the past, but the tests grew old, were poorly maintained, and few people understood how they worked after our shift to Docker (read more about that in a previous post).
Unless you've been under a rock in the Twitter world for the last week, you will have seen the #ILookLikeAnEngineer hashtag. Here at Lyst, we have some brilliant engineers, many of whom are women. We decided we should tell you all a little bit more about ourselves, how we came to engineering, and what advice we have for women wanting to be engineers themselves.
Nearly half of the staff at Lyst are technical or have a technical background. We have a large technology stack and plenty of exciting projects that we’re working on. But we’re often so focused on developing great experiences that we don’t get the time to share what we’re doing with you.
Our engineering team is taking a short hop to mainland Europe this July to attend EuroPython 2015 in sunny Bilbao, Spain. We’ll be spending six days with fellow Pythonistas from all across Europe (and even the world!) and attending over 200 sessions, workshops, and social events.
Nearest neighbour search is a common task: given a query object represented as a point in some (often high-dimensional) space, we want to find other objects in that space that lie close to it. For example, a mapping application will perform a nearest neighbours search when we ask it for restaurants close to our location.
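As a minimal sketch of the task itself (not the method the post goes on to describe), a brute-force search simply measures the distance from the query to every point and keeps the closest ones. The function name `nearest_neighbours`, the sample coordinates, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import heapq
import math


def nearest_neighbours(query, points, k=1):
    """Return the k points closest to `query`, nearest first.

    Brute force: computes the Euclidean distance to every point,
    so the cost grows linearly with the number of points.
    """
    return heapq.nsmallest(k, points, key=lambda p: math.dist(query, p))


# Hypothetical 2-D locations, e.g. restaurants on a map.
locations = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.2)]

# The two locations closest to where we are standing.
closest = nearest_neighbours((0.4, 0.4), locations, k=2)
print(closest)
```

This works for any number of dimensions, but scanning every point becomes slow for large collections, which is why faster approaches to nearest neighbour search are worth exploring.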
ICLR is a relatively new conference that is primarily concerned with deep learning and learned representations. The conference is in its third year and had over 300 attendees, two of whom were from Lyst. In this post we’ll discuss a few of the interesting papers and themes presented this year.
For the past few years I have advocated best practices for building REST APIs, and I spent a lot of time building reasonably well-designed examples to help demonstrate them. I learned that building REST APIs from the ground up isn’t hard at all because you have no legacy or technical debt to work with, so of course everything is going to work well and be praised for being RESTful.
I've never really got into Arduinos, Raspberry Pis and the like, and haven't touched a breadboard for 12 years. Despite this, I won a hardware hackday at the recent AWS re:Invent conference.
Bayesian analysis of A/B tests is a great way of getting reliable inference. Except, of course, when we get our priors horribly wrong.
The OpenRoss image service serves dynamically resized images from Amazon S3 quickly and efficiently, and auto-scales with traffic.
We process millions of fashion products a day from over 500 retailers. One of the goals of the data team is to transform this stream of semi-structured data into one consistent product catalogue. Colour is one of the most difficult fields to normalise. In this post we discuss how product colours are derived from product images.
We process millions of images using an ecosystem of classifiers. To get the most information out of an image, it is best to remove the background, as it may contain data that makes the classifier less accurate. In this post we discuss methods of removing backgrounds from images.