AllSeq's Knowledge Hub for the Next Generation of Sequencing

By Allison Proffitt

December 12, 2013 | By the time Danish company BlueSEQ had finished its market research, a few things were clear: the sequencing industry needed more than just provider listings, there was a huge opportunity for a user-friendly matching service, and that service needed to be based in California.

The new entity, AllSeq, opened offices in San Diego this April after having purchased BlueSEQ’s intellectual property: a software platform for connecting the right researchers to the right sequencing providers. The company isn’t revealing its funding sources, but says that its investors include some BlueSEQ investors and some new ones. AllSeq’s headcount stands at five full-time employees and a handful of part-time staff.

Shawn Baker was involved with BlueSEQ before he took the CSO reins at AllSeq. When Bio-IT World first sat down with Baker, it was over lunch at the 2013 Consumer Genetics Conference. We discussed the commoditization of sequencing and what it should look like to order sequencing online. Baker knows the industry and is committed to knowing and meeting the needs of researchers and providers. I made sure that the next time we sat down, I had a recorder in hand.

Bio-IT World: Shawn, I’ve heard the overview, but can you tell me about AllSeq again so I can actually quote you on it?

Shawn Baker: Sure! What we’ve done is we’ve created the sequencing marketplace, which is where we match researchers who need sequencing with the right providers who have the capacity for it. So we provide tools for the researchers to go on the site and describe their sequencing projects in a really uniform way, capturing all the information that they need to know and all the information that the providers would need to know, in order to give them accurate estimates.

We consider technology platforms and geographical constraints. For example, researchers may want restricted projects performed only in their home country, or they may not care and the project can be seen worldwide. And of course, they may have some sort of time constraints. Based on all these factors, we figure out which of the providers that we have are potential matches, and give those providers access to the project. Providers can then log on and decide if those projects match their current workflow, if they have the capacity, if it’s an application type that they’re currently working on or would fit in with the other projects they’re doing. They can offer researchers estimates, just to say, I see what you’re doing, it’s going to cost probably about this much, but if you want to modify it a little bit, maybe sequence a little deeper, sequence a little less, it will cost this other amount. A researcher can then look at all those estimates from all the interested providers, and quickly tell them which ones they actually would like to move forward on: “I am interested in this; I’d like you to take it to the next step. Let’s make that an actual quote, an actual binding quote, so that if we sign it, it’s a done deal.”

Are you picking who you think is the best fit? What are you basing those decisions on?

We’re actually not so much picking the best fit, we’re just excluding people who we know wouldn’t be a good fit. And it’s really objective criteria. We’re not making a subjective call. For example, we might have someone who has funds that can only be spent at a US academic lab and so they’ll say, I just want to see US academic labs bid on this. And so we would exclude, obviously, all non-US places and all US commercial labs. Sometimes people might say, it could be anywhere in Europe, but it can’t be the Americas. And then a lot of people just say yeah, anywhere. So those are the kinds of things we filter on, to prevent people from wasting their time. We don’t want users to see bids that they know they can’t actually act on and we don’t want to show providers projects that we know they can’t bid on. So what we’re really trying to do is just streamline it and make it more efficient for everyone involved.
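Editor's illustration: the exclusion-only matching Baker describes can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not AllSeq's actual software. The `Provider` and `Project` types, their field names, and the sample labs are all hypothetical. The code shows only the idea that providers are removed by objective criteria (platform, country, region, sector) and never ranked.

```python
from dataclasses import dataclass

# All names below are hypothetical; they only illustrate exclusion-based matching.

@dataclass(frozen=True)
class Provider:
    name: str
    country: str          # e.g. "US", "DE"
    region: str           # e.g. "Americas", "Europe"
    sector: str           # "academic" or "commercial"
    platforms: frozenset  # sequencing platforms the provider runs

@dataclass(frozen=True)
class Project:
    platform: str
    allowed_countries: frozenset = frozenset()  # empty means "anywhere"
    excluded_regions: frozenset = frozenset()   # empty means "no region excluded"
    allowed_sectors: frozenset = frozenset()    # empty means "any sector"

def eligible_providers(project, providers):
    """Drop providers that objectively cannot take the project; do not rank the rest."""
    result = []
    for p in providers:
        if project.platform not in p.platforms:
            continue
        if project.allowed_countries and p.country not in project.allowed_countries:
            continue
        if p.region in project.excluded_regions:
            continue
        if project.allowed_sectors and p.sector not in project.allowed_sectors:
            continue
        result.append(p)
    return result

PROVIDERS = [
    Provider("Lab A", "US", "Americas", "academic", frozenset({"Illumina"})),
    Provider("Lab B", "US", "Americas", "commercial", frozenset({"Illumina"})),
    Provider("Lab C", "DE", "Europe", "academic", frozenset({"Illumina", "PacBio"})),
]

# "I just want to see US academic labs bid on this."
us_academic_only = Project("Illumina",
                           allowed_countries=frozenset({"US"}),
                           allowed_sectors=frozenset({"academic"}))
# "It can't be the Americas."
not_americas = Project("Illumina", excluded_regions=frozenset({"Americas"}))
# "Yeah, anywhere."
anywhere = Project("Illumina")

for label, project in [("US academic only", us_academic_only),
                       ("not the Americas", not_americas),
                       ("anywhere", anywhere)]:
    print(label, [p.name for p in eligible_providers(project, PROVIDERS)])
```

In this sketch an empty constraint means "no restriction," which mirrors the researchers who simply say "anywhere." The same filter also determines which projects a provider gets to see.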

Can you give me some metrics for your platform? Where your providers are, where your researchers are coming from?

We need to have providers that cover the globe and have access to all the various platforms. So we really work hard to try to make sure we have that covered. Some are easier than others. We’ve got a lot of Illumina providers. It’s a little bit harder to find Ion Torrent providers. They’re certainly out there, but there don’t seem to be quite as many. And probably the hardest one is PacBio. There aren’t that many providers worldwide, so we’re really trying to make sure we have those guys covered.

In terms of the researchers who are using the system, I would say very roughly half would be from the Americas, maybe 30% from Europe and 20% from Asia. We are probably more heavily skewed toward academics, but we certainly have many commercial people looking to use the system as well.

Researchers go to the site whenever they have new work, to ask for information. Labs or providers, are they getting emails from you? Do they log in and have a dashboard, how does that work for them?

Yes, most of those. If they log in, they will see a dashboard, they’ll see all the projects that they can bid on, the ones they have bid on, kind of the status of everything. We have communication tools in there that let people—the buyers and the providers—communicate back and forth about particular projects. They can see all that from the dashboard. In addition, the system does send out alert emails when new things happen. If a new project becomes available for a provider, we tell them about it, and if there’s new communication in the system, we tell them about it. It’s the same thing for the buyers, if there’s new communication on their project or a new estimate or a new quote, they can see that in their dashboard and the system will alert them.

Do researchers get to say this group was great or this group was less great?

Yeah, absolutely. We are still in the process of building all that and it’s a little bit manual right now… Once they’re completely done with their project, we’ll ask them, how did that go? What did you like? What did you not like? How would you rate that provider? And they can give us their subjective view. There’s also some fairly objective things that we can look at. We can look at how responsive those providers were. When a buyer puts in a comment or a question, how quickly does the provider get back to them? How closely to the stated turnaround time did they stick? Were they on time, were they late? So there’re all sorts of things like that that we have access to and we’re going to use that kind of information in an overall provider assessment.

The other thing we’re going to add to that is specialty information from the providers. Certainly there are some providers who are just trying to get the enterprise. They’re just trying to be a commodity: we can do this cheaper or we can do it quicker, or whatever. But we’re finding a lot of them are actually trying to specialize. Sometimes it’s on data analysis, sometimes it’s on what kinds of specialty applications that they are used to handling. And we’re going to give them a venue within the system to talk about that. When a buyer sees a provider has put in an estimate or a quote, they’ll be able to click on their profile and see any past reviews, see the internal system review of how they’ve been doing, but then also any kind of specialty information that the provider wants to show there, any kind of a special expertise they might have that would sway the buyer one way or another.

What’s the cost model, the business model?

We wanted to make it easy for everyone to do this, because we think it’s really useful and really important for them. So it’s actually free for everyone to sign up and to do everything that the system offers, up until the very final step. What that means is, only after a match has been made between a buyer and a provider, and both have agreed to it, do we take a small fee from the provider based on the size of that project. But up until then, researchers can log in, they can put as many projects in as they want, they can use all the online tools that we have for free. The providers can sign up to the marketplace for free, they can see all the potential projects, they can put estimates in, they can bid on them, they can communicate with the buyers, they can do all those things. It’s only when they know they’re getting paid that we get paid.

At any point in this is there any blinding, or does everyone know immediately who has the job and who’s bidding?

No, we don’t really hide anything. Once a provider puts in an estimate or a bid or even poses a question, maybe for clarification, we tell – in the system, everyone knows exactly who everyone is. So this isn’t like Provider A, Provider B. It will actually say who that is.

How do you keep them within your system so that you know when you should get paid?

Well, that’s something that a number of people have asked us and in theory, it could be a problem. People, of course, could try to cheat the system and there are things you can try to do to make sure that doesn’t happen. But we think the most important thing to do is—and what we think we’re doing—is make the tools so useful that both parties here actually want to use it.

We are unique in this space in that we’re the only ones who are really trying to actively solve the problems that people have. It’s not easy for researchers to find the right provider and it could be difficult for providers to find the right researchers who actually have the projects that they can work on. So the better we make the tools, the easier we make it, we just basically make them want to use the system and to make it easier than what they were doing outside of the system. So as long as we keep doing that, we don’t think there’s really going to be a strong desire for them to try to cheat the system.

Okay. So that’s the base offering, but you’ve told me you’ve got much bigger plans than that. What else is on the horizon?

We really set up AllSeq, as I said, to try to solve some problems. One of the problems that we already talked about was literally getting researchers who have sequencing projects finding the right provider and the provider finding the right researchers. But the other problem that we’ve seen is that it can be really difficult to get trustworthy, neutral information on this really fast-moving field. So there’s a number of platform providers, vendors and they all have their own websites and they all talk about what they’re doing, how great their systems are and how powerful they are and what they can do. And they might position themselves versus the other platforms—we do this better than the other guys do—but they’re essentially never going to give you that truly neutral look.

That’s where we step in. We take an objective look at all these platforms, we assess them in terms of what they should be able to do, based on the stated specs, we monitor what people are doing in the field, we pay attention to the conferences, to the blogs, to the trades and really kind of collect all that together to try to figure out which platforms are best for which applications. And then we let the community know about it. That’s our next-generation sequencing knowledge bank. And again, that’s free for everyone to use.

One of the really obvious things to start looking at, and we’ve already started doing this, is the bioinformatics space. So right now we’re concentrating on helping people figure out the right platforms that they might need and the right providers who have those platforms, but of course, once you generate the data, you need to analyze it. So we’ve got some basic bioinformatics. I don’t want to say we have the offerings, but we help people find bioinformatics resources, though in a relatively rudimentary way.

The other big space, of course, that everyone’s interested in is the clinical space. The clinical market is really looking at sequencing and trying to figure out how they can use it. We definitely want to be a part of that. I don’t think it’s going to look the same, so maybe our underpinnings that we use to run the system will be the same, but how clinicians interact with us will probably be pretty different from how researchers interact with us. And so that’s something we’re always keeping an eye on to try and figure out when is this market really taking off and what problems do they have that we think we can solve?

You mentioned that AllSeq is working on a partnership with GCAT (Genome Comparison and Analytic Testing), a free, community-driven platform for evaluating the performance of NGS data analysis methods that is now owned by Gene By Gene. How will that add to AllSeq’s offerings?

We see the AllSeq mission and the GCAT mission as highly complementary with almost no overlap. As such, a partnership between the two of us seemed ideal. We're starting out simple (for example, by making sure that AllSeq users are aware of the GCAT tools), but we have plans for a much tighter integration in the future. Deciding on the proper data analysis methods is one of the hardest steps in an NGS project, and we feel the GCAT tools can really help with that.

We would like to, in the future, have a deeper integration and basically use the tools that GCAT has on the AllSeq site to help our researchers figure out what they need to do. So for example, they might be putting a project together for exome sequencing in human and one of the things we’re going to ask them is, how much sequencing do you need, what kind of coverage? Do you want 30X, 50X, 100X, 1000X? And sometimes people know and sometimes they don’t. Sometimes these are pretty brand-new users who haven’t done this before and they need a little bit of help. We try to give them that help, but with the GCAT tools, we can do it in a more objective way. We can say, “Well, tell you what, here are some standardized datasets that many, many people have analyzed. You could see what that data would look like at those different coverage levels, 30X, 50X, 100X. Maybe the difference between 30X and 60X is really dramatic, but 60X and 120X, you start seeing diminishing returns.”

Because it’s so easy to do on our system, they could even ask for estimates for both and they might find that the 120X is better, but it’s costing so much more because it’s kind of an unusual thing to ask for. They can decide that, “Well, maybe that’s not so important, I’ll just drop back down to the 60X or 50X.” Those are the kind of things that we think with a deeper partnership with GCAT, we’re going to be able to really offer our users some tools that would be really helpful.