Funded Startup Seeks Lead Engineer for Data Mining of Code
We are a fully funded startup building a web-based service for analyzing and searching code. You will be in charge of building a large, distributed computing system for parsing and indexing code for our search engine. We've been creating the code analysis tools for existing customers, and we need you to build the automation framework to help us scale.
We're solving problems that will have a major impact on the software industry. For example, one goal is that independent, open-source authors have a way to get paid for their work. This is not yet another soul-sucking project to mine social networks to create ads!
You'll team up with us to implement solutions to some challenging problems. (But don't worry, you don't need to be an expert in all of these already.)
- Nearest-neighbor search in high dimensions (or knowing how to avoid this by projecting to a lower dimension)
- Classifying code by various metrics (structural flowgraph analysis, symbol sequences and frequency, etc.)
- Map/reduce deconstruction of complex queries
- Distributed computing design, cluster management, software deployment, load balancing
You'll be working with our founder, Nate Lawson, a security expert and lead engineer at several successful startups. He founded Root Labs in 2007 and grew it into a profitable consulting firm focused on designing and analyzing embedded systems and cryptography. Before that, he was an early employee of Cryptography Research, which was recently acquired for $340M. He was the lead engineer for the Blu-ray content protection system called BD+, which was acquired separately for $60M. Even farther back, he invented and built the first network intrusion detection system, ISS RealSecure.
- Python or Ruby
- Unix programming (Linux and/or FreeBSD)
- Working independently: solving problems, managing your time, and staying self-motivated
- MySQL or PostgreSQL
- Key/value stores, especially Riak and Redis
- Cluster management and Unix admin (Fabric and Chef)
- Compilers, linkers, and language toolchain internals in general
- Low-level computing (assembly language, linkers/loaders, compiler optimizer design, intermediate languages)
- Using the right algorithm and implementation for the right problem. Knowing how to do profiling and basic statistics to make that choice.
- Machine learning: clustering, classification (locality-sensitive hashing, SVMs)
- Distributed systems and fault-tolerant computing (BigTable, GFS, Dynamo, and similar designs)
- At least one assembly language (x86 preferred)
We're only interested in people who produce working code and deploy it. This is not a research position involving modeling and R. We're a fast-paced company -- if you run into a problem, it's often best to come up with a heuristic and continue around it. You don't have to implement program analysis tools yourself, but you'll be building tools in Python and C/C++ that analyze the data we've extracted from the code.
This position is full-time and onsite in Oakland, CA. We're not too far from BART and have a nice office. We would consider relocating an excellent candidate. However, you must be a US citizen or legal resident (sorry, we're still too small to handle visa issues).
- Competitive salary
- Stock options
- Health care
- Comfortable offices on Piedmont Ave, near great restaurants and shops
- Ground-floor timing and chance to shape technical direction of startup
Interested? Please email the following to:
- Code samples (extra credit if for relevant open-source or hobby projects)
We'll respond to you within a few days. After this, we'll send you a programming problem that you can do at a time of your choosing. You can probably solve it in an hour. Finally, we'll interview you on the phone and in person. Don't worry, it's not as complicated as it sounds, and we won't waste your time.
Please only contact us if you are representing yourself. No recruiters. If you know someone who might be a good fit, feel free to pass this along. Thank you.
Still want more?
You can read about us on our blog, including these popular posts: