Data science has always been less about machine learning than about cleaning, shaping, and moving data from place to place.
Here are the important concepts in data management:
Sources - how to get training data
Labeling - how to label proprietary data at scale
Storage - how to store data and metadata appropriately
Versioning - how to track dataset versions as data is updated through user activity or additional labeling
Processing - how to aggregate and convert raw data and metadata
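
As a rough illustration of the processing and versioning steps in the list above, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the record fields, the RAW_RECORDS and LABELS data, and the process and dataset_version helpers are hypothetical, and hashing the serialized rows is only one simple way to identify a dataset version, not a recommendation of any particular tool.

```python
import hashlib
import json
from collections import Counter

# Hypothetical raw records and label metadata, invented for illustration.
RAW_RECORDS = [
    {"id": 1, "text": "  Great product "},
    {"id": 2, "text": "Terrible support"},
    {"id": 3, "text": "great PRODUCT"},
]
LABELS = {1: "positive", 2: "negative", 3: "positive"}  # record id -> label


def process(records, labels):
    """Convert raw records into cleaned rows joined with their label metadata."""
    rows = []
    for record in records:
        rows.append({
            "id": record["id"],
            # Lowercase and collapse whitespace so equivalent texts match.
            "text": " ".join(record["text"].lower().split()),
            "label": labels[record["id"]],
        })
    return rows


def dataset_version(rows):
    """Derive a short version identifier from the dataset contents."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


rows = process(RAW_RECORDS, LABELS)
label_counts = Counter(row["label"] for row in rows)  # aggregate per label
version = dataset_version(rows)
```

Because the identifier is computed from the contents, relabeling a record or adding new rows yields a different version string, which makes it easy to tell which data a given model was trained on.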