Business, government, and science researchers are producing massive amounts of complex data. The availability of these huge datasets fuels a need for both data-driven analytics and a 21st-century workforce that can use data analytics to answer questions and solve problems. This collaborative project will develop a cloud-based virtual platform to train undergraduate students to use software tools essential to data science. The platform will make state-of-the-art computing resources, including both powerful data analysis tools and parallel hardware systems, more accessible to students and faculty, even at institutions without locally available high-performance computing systems. The project aims to help students develop critical workforce skills in data science. The project will also provide professional development opportunities to help faculty use data analysis tools in their courses and research.
The goal of this project is to develop a cloud-based infrastructure in the form of a virtual science platform with related training modules. First, it will leverage an existing framework for building web applications to provide broad access to open-source, high-performance computing resources at the collaborating universities and through the NSF Extreme Science and Engineering Discovery Environment. The cloud-based platform will support both the training of students and collaboration among them. Second, the project will produce a data science curriculum targeted to undergraduate students. The curriculum will also be suitable for graduate students, post-doctoral researchers, and information technology professionals interested in data science. The project will deliver a full set of interactive documents and video tutorials on using and configuring the platform. The educational activities, accessed through the cloud-based platform, will use graphical, interactive, simulation-based, and experiential learning components to teach data science concepts and computing skills. Through the platform, students will have the opportunity to learn how to use powerful data science resources, enabling them to transform data-rich computer science and engineering problems into practical solutions. Third, the project will deliver professional development for faculty at multiple institutions, to help them learn how to use data science in their classrooms and their own research. This project addresses national interests by making state-of-the-art computing resources more accessible to students, supporting their development of critical workforce skills.
Ranjini Subramanian, Summer 2018 - now
PhD Topic: Visual Analysis and Machine Learning for Sequential Pattern Mining from Multimodal Data Sets
This PhD dissertation will focus on extracting statistically reliable events and their associations (i.e., sequential patterns) from large-scale time-series data, using information visualization interfaces, sequential pattern mining algorithms, and scalable computing techniques.
The deliverables of this PhD research will include reports and research articles that document the research outcomes, as well as software components and integrated interfaces to plot time-series data and highlight frequent events and phenomena, and quantitative methods to extract the temporal properties of these events. To address usability and scalability, the framework will support the use of high-performance computing resources to run the proposed visual and computational methods.
Summer 2018: Together with other CECS labs, we hosted high school STEM teachers in a 6-week Research Experiences for Teachers (RET) program, in which the teachers took part in our research activities alongside faculty and PhD students in the labs. Project: learning and using R for scientific visualization.
Last updated on March 15, 2019.