Succeeding in Data Science — Chapter 2 Excellence

<<Chapter 1 Passion

In his memoir, Matthew McConaughey describes the moment when he had to tell his father that he wanted to switch from law school to acting. He expected his intolerant father to skin him alive.

But his father listened to him, paused, and said in affirmation, “Son, don’t half-ass it!” This stuck with Matthew, and he is now one of the most prominent actors.

Similarly, if you have decided to get into data science, don’t half-ass it. Aspire to excellence!

Excellence. Some adages are dramatic about it. For example, “either aspire for excellence or go home.” Some are preachy. For example, “excellence is doing ordinary things extraordinarily well.”

All of these are as hollow to me as “follow your passion” was. They are like sermons. Although correct, they rarely tell you how to act on them.

As a data scientist, I look for patterns to draw a path to follow. The entry gate was passion identification in Chapter 1. Now, I will lay a path to excellence in data science (Figure 1).

Figure 1. Excellence Path.

The path has four stations: theory, development, communication, and leadership. The details about what they are and how to develop them are in the subsequent chapters. Here we will learn the motivation behind them.

Theory

There is a saying, “an uninitiated has several solutions but an expert has only a few.” This is true for data scientists.

A good data scientist may not always know what will work, but knows what is unlikely to work. Filtering out the unlikely saves time and effort. Doing more in less time is, of course, expected of an expert.

This ability to distinguish the likely from the unlikely comes from theoretical understanding: an understanding of how data science methods work, what their rules are, and how to break them.

This level of understanding is the first step towards developing a solution. Chapter 3 goes over what theory to learn and how to learn it.

Note that I used the word “likely.” Why? Because as data scientists we can only have hypotheses. We cannot be certain. I will never say, “this will work.” I say, “this is likely or unlikely to work.”

This is my response whenever I am asked about my approach. I do not use this nuanced language to evade commitment. Instead, it is to remind myself that I am a researcher, an implementer, and not an oracle.

I must keep my mind open to alternatives. I must be prepared to switch to an alternative when appropriate instead of attaching my ego to one approach. Attaching ego is more common than we think. Refrain from it!

Data science is a technical field. It is founded on the principles of mathematics, statistics, and computer science. It requires a thorough understanding of the relevant topics within each. Chapter 3 goes deeper into this.

Lastly, do not force yourself to look like an expert. Take the journey to go from beginner to expert. It is okay to be a beginner. We all started from here.

Development

You cannot make someone successfully implement your idea if you cannot do it yourself!

There was a time when the data science profession was new. Data scientists at the time were scarce and mostly came from a research background. To some of us, development was menial.

We would put ourselves on a pedestal from where only data science modeling looked elegant, done by only a few capable souls like us; but its development was beneath us.

Devoid of development skills, we would be paired with an engineer. This arrangement failed miserably!

Many data science projects spilled all over the place. The money invested in them was lost. This caused skepticism in some corners about the applicability of data science.

As data scientists, we owe it to everyone who puts faith in us to come down off our pedestal. Give equal importance to development. Remind yourself: it is called research and development.

Importantly, development should not be restricted to prototyping. There is a long road from a prototype to a commercial product, a journey also called productization.

This road becomes significantly longer if you hand over the development to someone else. Sometimes it even becomes impassable.

It is, therefore, essential to learn development to take your solution from prototype to product.

Take your solution from prototype to product.

Eventually, you may move up the ladder and become the head of a data science team. You will have a team for development. Still, you will only succeed if you can develop the solution yourself. Only then can you plan, review, and help your team towards a successful development.

You can only make your team build a solution if you can develop it yourself.

Communication

It is all for naught if not communicated.

Albert Einstein was one of the best communicators. One of his famous publications appeared in the Special Features of The New York Times in 1929. In his article, “Einstein Explains His New Discoveries,” he wrote about his latest contribution to gravitation, electromagnetism, and ideas of time and space.

Figure 2. Einstein Explains His New Discoveries. Source: The New York Times.

The article became so popular that department stores in London posted the newspaper in their windows for passers-by to read. Large crowds gathered to read it every day!

Similarly, Stephen Hawking is well known for his book, A Brief History of Time, in which he explains the cosmos.

Why are both of them and their work famous? Because they took the time and effort to communicate in simple words. Simple enough to be understood by a layman.

Unfortunately, like development, communication is sometimes considered trivial or unnecessary. Some of us believe that results speak for themselves.

No, they don’t! You have to tell the story every time.

Results do not speak for themselves. You have to tell the story.

Some of us also believe that preparing a report, article, or presentation is not the best use of our time. After all, we are smart scientists. Our time is best invested in research.

Wrong!

Communicating, whether written or verbal, helps us as much as our listener. It unravels our idea. It improves our own understanding of the problem. This prevents us from unintentionally drifting away from the actual problem. Sometimes we also discover problems with our idea.

Basically, we cogitate while communicating. In doing so, our thoughts become clearer. We get new ideas from within. Importantly, we also open the gate for ideas to come from our listeners.

Above all, communication makes you modest. You overcome your ego. You’ll discover issues in your solutions. You’ll learn that understanding a problem and arriving at its solution is a process. And that results cannot speak unless you do!

Leadership

Leadership is often confused with leading a team. This is incorrect!

It is about owning and getting things done. In any position, whether an individual contributor, senior, or lead, you must own your task. And, get it done.

Leadership means owning and getting it done!

Speak with whomever you need to. Collaborate with anyone needed. Follow up when you must. Remember, it is up to you to make it work and get it done!

It is easy to find reasons for a project to be delayed, to not deliver, or to fail. A server can crash. An engineer’s laptop gets stolen. A data scientist leaves for a vacation. There could be numerous reasons. But you still must deliver a quality product on time.

This seems unrealistic. We cannot control so many unknowns. How can we still commit to quality and on-time delivery? This is the mentality we must change.

I respect time. Now I am getting better at arriving by a committed time. But this wasn’t the case a few years ago. Even then I respected time, but for one reason or another I would end up late. Once I was invited to a friend’s birthday party in the evening. I resolved to be on time. I came back home early, showered, and got ready by 5 pm for the party at 7 pm. Or so I thought. I was dilly-dallying at home when I got a text around 6:30, “where are you?” The party had actually started at 6 pm, and I was late again. I told myself that this time it wasn’t my fault. I was ready. It was just some miscommunication. But the reality is that I could have double-checked the time.

This changed my mentality. I started to look ahead for potential issues and plan for them. This is the point. You must be prepared for any last-minute or intermediate problems.

For example, have a backup server. Create a robust code and data management system in which everything is synced to a remote cloud, to remove dependence on individual computers. Do not allow a situation where one data scientist becomes such a bottleneck that their absence halts or slows a project. Create a work environment where people collaborate and work as a team. This unclogs resource bottlenecks.

Foresee problems to prepare ahead.

The point is that with leadership skills you can foresee potential problems. You prepare ahead. You learn that you cannot stop problems from happening. But you can be prepared to be minimally affected.

Excellence is an endless journey. The Excellence Path is, therefore, circular. Each of its four stations (theory, development, communication, and leadership) has so much to learn. Learning requires planning. That is why planning is at the center of the path.

Coming next: what to learn, how to learn, and how to plan.

>>Chapter 3 Theory

Disclaimer: The post contains original content copyrighted to the author.

Director of Science at ProcessMiner | Book Author | www.understandingdeeplearning.com