The data will be the one that tells you the story. You might not understand the domain entirely, right? You're trying to gain understanding even in the very worst cases. And if the signal's there, you can usually get away with using really dumb and simple models like decision trees or linear regression. This is sort of, I think, what makes it different, that you're moving towards a goal, but you don't know what that goal is.