Opening Up Data Science with data.world

Thu 18 Aug 2016

Earlier this year when I was in Austin, my friend Andy Sernovitz introduced me to a new startup called data.world.

What caught my interest is that they are building a platform to make data science and discovery easier, more accessible, and more collaborative. I love these kinds of big juicy challenges!

Recently I signed them up as a client to help them build their community, and I want to share a few words about why I think they are important, not just for data science fans, but for scientific discovery more widely.

Armchair Discovery

Data plays a critical role in the world. Buried in rows and rows of seemingly flat content are patterns, trends, and discoveries that can help us to learn, explore new ideas, and work more effectively.

The work that leads to these discoveries often involves bringing together different data sets to explore them and reach new conclusions. As an example, traffic accident data for a single town is interesting, but when we combine it with data sets for national/international traffic accidents, insurance claims, drink driving, and more, we can often find patterns that help us to influence and encourage new behavior and technology.

Many of these discoveries are hiding in plain sight. Sadly, while talented data scientists are able to pull together these different data sets, doing so is often hard and laborious work. Surely if we make this work easier, more accessible, more consistent, and available to all, we can speed up innovation and discovery?

Exactly.

As history has taught us, the right mixture of access, tooling, and community can have a tremendous impact. We have seen examples of this in open source (e.g. GitLab / GitHub), funding (e.g. Kickstarter / Indiegogo), and security (e.g. HackerOne).

data.world is doing this for data.

Data Science is Tough

There are four key areas where I think data.world can make a potent impact:

  1. Access – while there is lots of data in the world, access is inconsistent. Data is often spread across different sites and formats, and is accessible only to certain people. We can bring this data together into a consistent platform, available to everyone.
  2. Preparation – much of the work data scientists perform is learning and prepping datasets for use. This work should be simplified, done once, and then shared with everyone, as opposed to being performed by each person who consumes the data.
  3. Collaboration – a lot of data science is fairly ad hoc in how people work together. In much the same way that open source has helped create common approaches for code, there is potential to do the same with data.
  4. Community – there is a great opportunity to build a diverse global community, not just of data scientists, but also organizations, charities, activists, and armchair sleuths who, armed with the right tools and expertise, could make many meaningful discoveries.

This is what data.world is building, and I find the combination of access, platform, and the network effects of data and community particularly exciting.

Unlocking Curiosity

One of the most profound impacts technology has had in recent years is in bringing people’s curiosity and creativity to the surface.

When we build community-based platforms that tap into this curiosity and creativity, we generate new ideas and approaches. New ideas and approaches then become the foundation for changing how the world thinks and operates.

As one such example, open source tapped the curiosity and creativity of developers to produce a rich patchwork of software and tooling and, more importantly, a culture of openness and collaboration. While it is easy to see the software as the primary outcome, the impact of open source has been much deeper, reaching skills, education, career opportunities, business, collaboration, and more.

Enabling the same curiosity and creativity with the wealth of data we have in the world is going to be an exciting journey. Stay tuned.
