Getting labeled data isn't easy.
Reliable labeled data is crucial for any supervised learning task, but for large datasets, labeling by hand may be intractable. Services like Mechanical Turk can be particularly useful for crowdsourcing labels, but it is difficult to guarantee that workers will reliably provide correct ones. Each worker also brings their own subjectivity to labeling, and it is important to capture this bias somehow in the model.
The task becomes even more difficult as the number of classes grows. When labeling flowers, for example, an amateur labeler faced with over a hundred options may not be well versed enough to differentiate between families of orchids. Classes may also be ambiguous. Take the classic example of lions, tigers, and ligers: to the layperson, an image of a liger may fall closer to a tiger on the lion-tiger spectrum and thus be hard to classify.