There's a move towards doing more sort of cloud based stuff for data science. And it also solves the additional problem where data scientists often need to do remote development, like you need access to GPU's in order to train models. So that actually helps with that reproducibility a lot. Another point which I think is really important for data scientists and can be neglected is literate programming. This is an idea from Donald Knuth. You should write your code in such a way that it's actually understandable later with data science work.