Statistics for Data Science
上QQ阅读APP看书,第一时间看更新

Adding to your personal toolbox

In my experience, most data developers tend to lock on to a technology or tool based upon a variety of factors (some of which we mentioned earlier in this chapter) becoming increasingly familiar with and (hopefully) more proficient with the product, tool, or technology—even the continuously released newer versions. One might suspect that (and probably would be correct) the more the developer uses the tool, the higher the skill level that he or she establishes. Data scientists, however, seem to lock onto methodologies, practices, or concepts more than the actual tools and technologies they use to implement them.

This turning of focus (from to tool to technique) changes one's mindset to the idea of thinking what tool best serves my objective rather than how this tool serves my objective.

The more tools you are exposed to, the broader your thinking will become a developer or data scientist. The open source community provides outstanding tools you can download, learn, and use freely. One should adopt a mindset of what's next or new to learn, even if it's in an attempt to compare features and functions of a new tool to your preferred tool. We'll talk more about this in the perpetual learning section of this chapter.

An exciting example of a currently popular data developer or data enabling tool is MarkLogic (http://www.marklogic.com/). This is an operational and transactional enterprise NoSQL database that is designed to integrate, store, manage, and search more data than ever before. MarkLogic received the 2017 DAVIES Award for best Data Development Tools. R and Python seem to be at the top as options for the data scientists.

It would not be appropriate to end this section without the mention of IBM Watson Analytics ( https://www.ibm.com/watson/), currently transforming the way the industry thinks about statistical or cognitive thinking.