
Kyle and William at SC'16

We explore how recommendation techniques can be adapted and applied to big data science. Using Globus, we derive features specific to big data science and develop a set of data location prediction heuristics. We combine these heuristics into a single recommendation engine using a deep recurrent neural network. Our analysis of historical Globus data shows that these approaches predict the storage locations used in user-submitted data transfers with 78.2% accuracy for top-1 recommendations and 95.5% accuracy for top-3 recommendations.

We presented this work as an SRC poster, and an extension of it as a workshop paper, at Supercomputing '16. The SRC poster won Best Undergraduate Poster! My amazing mentor Kyle Chard also won the Early Career Researchers in High Performance Computing Award!
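
For readers unfamiliar with the top-1 and top-3 accuracy figures quoted above, here is a minimal sketch of how such a metric can be computed. It is not our actual pipeline: the frequency-based heuristic, the function names, and the endpoint labels are all hypothetical and chosen only for illustration. A recommendation counts as a hit when the storage location the user really chose appears among the first k suggestions.

```python
from collections import Counter


def recommend_most_frequent(history, k):
    """Hypothetical heuristic: suggest the k endpoints used most often so far."""
    counts = Counter(history)
    return [endpoint for endpoint, _ in counts.most_common(k)]


def top_k_accuracy(recommendations, actual):
    """Fraction of transfers whose true endpoint appears in its recommendation list."""
    if not actual:
        raise ValueError("need at least one transfer to score")
    hits = sum(1 for recs, truth in zip(recommendations, actual) if truth in recs)
    return hits / len(actual)


# Toy data (made up): past endpoints for one user, then four new transfers.
history = ["lab", "lab", "lab", "hpc", "hpc", "laptop"]
actual = ["lab", "hpc", "archive", "lab"]

top1 = top_k_accuracy([recommend_most_frequent(history, 1)] * len(actual), actual)
top3 = top_k_accuracy([recommend_most_frequent(history, 3)] * len(actual), actual)
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

Widening the list from one suggestion to three can only keep or raise the score, which is why the top-3 figure is always at least as high as the top-1 figure.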