Interview on Training Data for AI with Wilson Pang, CTO at Appen

Wilson explains the value of accurate training data for AI in ensuring that deployed applications are free of bias and deliver ROI

1. Tell us about yourself and how you came to be in your current role as the CTO at Appen?

I joined Appen in November 2018 as CTO and am responsible for the company’s products and technology. Before joining Appen, I was Chief Data Officer of CTrip in China, the second-largest online travel company in the world, where I led data engineers, analysts, data product managers, and scientists to improve user experience and increase operational efficiency, which grew the business. Earlier, I was Senior Director of Engineering at eBay in California, where I provided leadership across various domains including data service and solutions, search science, marketing technology, and billing systems. Before eBay, I worked as an architect at IBM, building technology solutions for various clients.

2. Can you tell us about your journey into this industry?

I started my career as a developer with IBM, building large systems for banks, telecom operators and securities exchange companies. I was excited by the power of software – you are building your own world when you write software and you can do almost anything.

I moved to eBay and experienced the great turnaround of the company. eBay was in trouble in mid-2009. The share price was at a historic low, well off its near-$60 historical high. It was cutting costs, growth was negative, market share shrinking, and the technology team wasn’t empowered to innovate. By bringing in new tech executives, eBay started to make the engineering team an ideas powerhouse and built it into an equal partner, alongside the rest of the business. The company started the journey to use technology, data and AI to drive business.

I was lucky to join and build the search science team, which was the first team to leverage machine learning to rank items for buyers. We had a huge amount of data and could easily A/B test new models to learn how they worked. Every time we optimized the search model, user conversion would increase, which translated to millions or tens of millions in increased revenue. 

Our team showed the whole company how powerful machine learning and data can be. Data science opened up an entirely different world than the engineering process I was used to.

The new problems I was working on were interesting, but also very different from the engineering challenges I still enjoyed. I hesitated to switch my career to what seemed like an entirely new field. My mentor, a great tech leader who founded Bing’s image and video search team and was leading the big turnaround for eBay, convinced me to go for the new challenge.   

This was a turning point for my career. For the next two years, I spent all my off time and weekends building my machine learning knowledge and picking up statistics. It was an intense period, but I learned the power of machine learning and how it can help change a business. Meanwhile, I continued to lead teams using machine learning and data to improve overall eBay search and marketing experience.

I then joined eBay’s data service and solution team and played a horizontal role, building data solutions for the whole company. We enabled product managers to optimize product experience with data, inventory managers to optimize inventory and price, marketing to optimize campaigns, and CRM teams to engage buyers and sellers, and we optimized the experimentation platform to support all the A/B testing. I got the opportunity to enable data-driven decisions for every team in the company. I also built a retail science team and data labs to detect trends and seasonality in inventory, help sellers decide the price for their products, and help buyers find interesting products.

After more than a decade with eBay, I joined CTrip as their chief data officer. My team leveraged data and machine learning to optimize the travel experience. We made significant revenue increases through search, recommendation, and CRM. We also saved huge costs by using accurate training data for AI in operations and customer service, improved internal efficiencies, and set up the data foundation for the whole company.

The more I worked in the machine learning and AI field, the more I realized the importance of data. Developers used to be the “God” of software, while training data for AI was becoming the “God” of AI applications. Appen’s mission is to create large volumes of high-quality training data faster. I strongly believe that mission will help the whole world adopt AI faster in a better way! So, I joined Appen to make AI work in the real world.

3. Tell us a little bit about Appen and how the e-book ‘The 2020 State of AI Report’ came about?

Appen provides reliable training data for AI to give leaders the confidence to deploy world-class AI products. With a global force of over one million skilled contractors, we have diverse data collection from over 180 languages and dialects and 130 countries. The annual ‘State of AI Report’ was introduced a few years ago to examine and identify the main characteristics of the expanding AI and machine learning landscape by gathering responses from AI decision-makers. With the changing environment and increase in conversations around AI and machine learning, the 2020 report is a way for our team to understand how AI is evolving, so we can better adapt to the landscape. It’s a very interesting time for AI, much like the early days of the internet. Many organizations have adopted the use of the internet at the core of their processes, and AI is on a similar journey from fringes to core value offering.

4. What were some of the hypotheses based on the trends you were already seeing?

With AI as a growing trend, we would expect to see companies continue to adapt and integrate AI tech into their roadmaps. COVID-19 has changed everything about the way companies are operating today, but not everyone has adapted in the same way. The State of AI report shows despite turbulent times, organizations do not expect any negative impact from COVID-19 on their AI strategies. Those that are prioritizing AI see the power of digital transformations as a way to improve their resiliency and long-term performance. The report also reinforced what we were already seeing in the market – that accurate training data for AI is the key to its success, with teams updating their models at least quarterly. However, a lack of data or data management continues to be a challenge. Many businesses are still at the early stages of their AI journey and they are finding that their data needs span beyond in-house resources when looking for high-quality, annotated training data for AI success. Industry leaders are turning more and more to third-party providers like Appen to help them deploy their AI projects.

5. What are some of the key findings in this report that will impact the way businesses approach AI and machine learning?

The report illustrates the current state of AI and machine learning, showcasing where the industry as a whole is in 2020 compared to 2019. We found 3 out of 4 companies saying AI is critical to the success of their business, while nearly 50% feel that their organizations are behind in their AI initiatives. This indicates that companies see the value in developing and leveraging AI, but still have a long way to go. We also saw an increase in AI budgets, especially in the number of organizations with budgets over $5 million, which doubled from 2019.

6. Are you surprised that respondents don’t expect a negative impact on AI implementation due to COVID-19, given the downturn the pandemic has triggered causing finances to be allocated to more critical requirements right now?

The survey results show that nearly three-quarters of companies find AI critical to the success of their business. That data point is crucial to understanding why so many businesses are continuing to focus on it amid the pandemic. Those that are prioritizing AI see it as a way to improve their resiliency and long-term performance. However, 31% of the respondents stated they were experiencing a delay or somewhat of a delay to their AI strategy due to COVID-19, which is to be expected, given the breadth of industries impacted by large parts of the economy being shut down for a few months.

7. Why are we now seeing buy-ins on AI programs from the C-Suite, and how will this impact projects?

Top-level executives are becoming more invested and involved in AI projects, because these projects now involve the whole corporation, as AI becomes a viable option to update core business processes.

Projects span many departments within a company from finance to marketing to development. C-suite leaders have the ability to create these cross-functional teams, fund them and enable them to complete AI projects successfully.

8. Why are there differences in the vision of AI tech between business leaders and technologists? How is this impacting AI adoption for enterprises?  

The biggest difference reported between the two was on the key challenges with AI. Technologists stated that lack of training data for AI was the key problem, while business leaders reported it almost 50% less often. This could stem from a different understanding of the importance of having the right datasets. Misalignment can cause issues with the project plan or the overall success of a project. The right organizational structure and making sure all team members are aligned on project details are critical for success.

9. Why is a lack of training data for AI / data management such a huge challenge still? How do you think companies can address that?

Data continues to be the biggest challenge in creating successful AI projects because of data drift: the data is always changing. We are seeing this today with the change in consumer behaviors due to the pandemic. AI products that are not constantly updated are at risk of not responding correctly, because they have not seen this type of data before. As each business collects and tries to make sense of more data, and trains its machines to adapt to the changes in society, we will find AI responses improving. Data will always be the key to success and the biggest factor that companies should invest in. After all, you don’t want to be an organization that builds AI that ends up being biased or works for only one type of customer.
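One simple way to watch for the data drift described above is to compare how a feature was distributed at training time with how it looks in recent production data. The sketch below uses the Population Stability Index (PSI), a common drift score that is swapped in here for illustration and is not a method named in the interview. The function name, the bin count, and the smoothing constant are all assumptions.

```python
import math

def population_stability_index(baseline, recent, bins=10, eps=1e-6):
    """Hypothetical helper: PSI between two numeric samples.

    Bins are equal-width and derived from the baseline sample's range.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bin_shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    p = bin_shares(baseline)
    q = bin_shares(recent)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))
```

A PSI near zero suggests the two samples look alike. Values above roughly 0.2 are often treated as a prompt to review or retrain the model, though that threshold is a convention, not a rule.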

10. What are some of the major developments being planned at Appen in the next couple of years?

We will continue to create this report annually to see how the AI landscape is adapting and growing. This year, we are planning to bring leaders together to continue the conversation around AI through a series of webinars called ‘Launching AI in the Real World: A Roundtable Series’. I’m excited to see the executives who join the conversations with us on ‘Responsible AI’, the ‘Goldilocks Problem’, ‘Building an Organizational Structure for AI’, and ‘Life after AI Deployment’. In the next few years, we’re also planning to take steps towards adopting more AI technology as part of Appen’s business processes and improve the way we provide high-quality annotated training data for AI and machine learning systems.

11. What digital innovation in the tech space do you think will leave a mark in 2020?

There are two areas in the ML/AI space that are definitely worth more attention.   

  • The first area is NLP (natural language processing) advancement. Google released the BERT model in late 2018 and there has been a lot of momentum in the NLP space since then. The BERT model uses huge amounts of online data and has outstanding performance in almost every language-related task. Today BERT (and BERT’s cousin and nephew models ☺) can be used as a pre-trained model and then fine-tuned for many applications. It helps AI better understand text, understand sentiment, rank search results better, correct human grammar, and chat like a human.
  • The second trend we see is around AI platforms and tools. Given that AI has become mainstream in many industries, there is an increased need to help data scientists build, debug, manage, and operate models in production. I expect to see more cutting-edge products in this domain from both big companies and smaller startups.

12. How do you keep up with the rapidly developing tech world?

I have a huge passion for technology and am lucky to be surrounded by world-class engineering and machine learning talent. I join our team’s tech discussions on architecture and machine learning model reviews and always learn something new. I also try to read one or two papers or tech articles every week. My favorite time is writing code by myself and sharing it with my team.

13. What book are you currently reading?

‘Jean-Christophe’ by Romain Rolland. It’s a classic that I have loved since my college days. It gives me passion and energy for life. 2020 has been a crazy year for almost everyone and I highly recommend this book, as it’s an excellent read in a turbulent time like now.

14. Which is the one go-to-phrase that you have believed in throughout your professional life?

“People deserve opportunities”.

I want to help others bring out the best in themselves by providing opportunities for them to grow. My career benefited from the support and mentoring of many world-class tech leaders and I enjoy doing the same for the new generation of leaders.

15. We would love to get a glimpse of Appen’s much-talked-about chic and fabulous work culture! Can you share some pictures of Appen’s office get-togethers and fun activities with our audience?

Here is a picture of a product launch celebration last year – we love chocolate fountains!

Right now, we are all socially distanced and working from home, but we are looking forward to getting back to the office for those get-togethers. Virtual coffee dates and happy hours have been keeping our teams connected and bringing some fun into our remote life.

16. What are some of the most-used apps on your mobile phone?

Google Podcasts, Coursera, Strava, WeChat, LinkedIn, Facebook, Twitter, Philz

Wilson Pang

A software engineering and data science tech leader, Wilson Pang, CTO at Appen, is passionate about driving businesses to succeed through innovation in training data for AI. Before Appen, Wilson was CDO at CTrip in China, Senior Director of Engineering at eBay, and a software architect at IBM.
