A Curated List of Resources for STEM Ph.D. Students
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
A curated list of resources for Ph.D. students. It includes academic resources, mainly for Computer Science or Engineering students, as well as other resources (e.g., personal development, communication, and writing) that I personally found useful and necessary during my Ph.D.
Disclaimer: I have not reviewed all the resources, but each one is recommended by me or by at least one of my advisor, friends, or colleagues.
- Signaligner Pro A desktop visualization tool, developed by my lab and the crowdgames lab, for viewing multi-day, high-frequency raw sensory data, arbitrary annotations, and algorithm predictions.
- Fundamentals of data visualization
- Chart Chooser Cheat sheet to choose the proper chart for data
- Data infographics Examples of data infographics
- The Data Visualisation Catalogue
- Data stories A podcast on data visualization
- TensorFlow’s data projector Data projector for high-dimensional data by TensorFlow
- Google PAIR’s facet Facet your data into pieces
- Gather Plot Understand high-dimensional data using violin plots
- HiPlot FAIR lab’s visualization tool for high dimensional data using parallel plots
- Data Vyu Video annotating
- How many people are around? A Python program that estimates the number of people nearby from Wi-Fi signals
- movisensXS Free experience sampling for Android
- Snorkel Programmatically Building and Managing Training Data
Make sense of data (statistics)
- Understanding Bayes: A look at the Likelihood
- Improving your statistical inferences A great online course to improve your understanding of statistical inference
- Seeing Theory Learn probability and statistics through visualization
- Simple Statistics Popular blog talking about data science
- Basic data science lessons with COVID-19 data A great illustrative article, by Adam Geitgey, on the importance of investigating data before rushing into building models
- Stats of 1 A collaborative blog on statistics for personalized digital health
- All of Us: health records dataset
- Google dataset search Google’s dataset search engine that indexes more than 25 million public datasets
- datacommons.org Data repo of US census data initiated by Google
Algorithm and Programming
Art of algorithms
- Introduction to algorithms The No. 1 free online course on algorithms
- Visualization of LeetCode algorithm solutions
Learn programming interactively
Play with Python
- Using Python for Research (Video course)
- Python cost model Get a sense of the time spent on basic Python operations
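To get a feel for what a cost model measures, here is a minimal sketch that times a few basic operations with the standard-library `timeit` module. The helper names (`time_operation`, `cost_table`) and the choice of operations in `OPERATIONS` are my own illustrative assumptions, not taken from the linked resource, and the numbers will vary by machine and Python version.

```python
import timeit

# Hypothetical selection of basic operations: name -> (statement, setup).
OPERATIONS = {
    "int add": ("a + b", "a = 1; b = 2"),
    "list append": ("lst.append(1)", "lst = []"),
    "list index": ("lst[500]", "lst = list(range(1000))"),
    "dict lookup": ("d['k']", "d = {'k': 1}"),
    "str concat": ("s + t", "s = 'abc'; t = 'def'"),
}


def time_operation(stmt, setup="pass", number=100_000):
    """Return the average time of one execution of `stmt`, in nanoseconds."""
    total_seconds = timeit.timeit(stmt, setup=setup, number=number)
    return total_seconds / number * 1e9


def cost_table(operations=OPERATIONS, number=100_000):
    """Time every operation and return a dict of name -> nanoseconds."""
    return {
        name: time_operation(stmt, setup, number)
        for name, (stmt, setup) in operations.items()
    }


if __name__ == "__main__":
    # Print operations from cheapest to most expensive.
    for name, ns in sorted(cost_table().items(), key=lambda kv: kv[1]):
        print(f"{name:<12} {ns:8.1f} ns")
```

The absolute numbers matter less than the relative ordering, which gives a rough intuition for which operations are cheap and which are not.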
Others’ resources collections
- Command Line Interface Guidelines Good principles for writing command-line programs
- Eli Bendersky On system-level and AI programming with Python, C/C++, and Go
Open source community
It is always good to participate in open source projects, since we have all benefited a lot from the open source work of others. During your Ph.D., you may find that the scripts you have written for your research projects can be turned into reusable open source packages to share with others, which can also earn citations from work that uses them. Here are some good venues to consider for publishing your open source software packages.
- In which journals should I publish my software? A good list of venues to consider for publishing your software
- How to Write a Git Commit Message Correct way to write commit messages
Learn machine learning
- Machine learning is fun!
- Ideas on interpreting machine learning
- Dive into deep learning
- Microsoft’s introduction to recommendation systems
- CMU’s human A.I. course
- Springer’s 65 free data and machine learning books!
Learn from the Giants
Learn from Q&A
- What tools are good for drawing neural network architecture diagrams?
Start with the state-of-the-art packages
- ARUS A Python package, under development by me, that provides a computational and experimental framework to manage and process sensory data or wireless devices, to develop and run activity recognition algorithms on the data, and to create interactive programs using the algorithms and data streams from wireless devices.
- AliPy Latest Active Learning algorithms in a single Python package
- Flair State-of-the-art Natural Language Processing framework
- DAWN DAWN is a five-year research project by Stanford to create tools to democratize AI by making it dramatically easier to build AI-powered applications.
- streamlit “The fastest way to build custom ML tools out of scripts”
- rlpyt A Research Code Base for Deep Reinforcement Learning in PyTorch
Words from famous scientists
- Jeffrey P. Bigham CMU professor for Human A.I.
- The coming A.I. autumn Start thinking about A.I. from a human perspective
- Adam Geitgey Very hands-on tutorials and blog posts about A.I. and machine learning applications. His articles are well written and fit my personal pace of learning
High performance computing
- Stanford Research Computing Lesson A very good lesson teaching you how to use high performance computing clusters
Health with technologies
- PROMIS measures Patient-Reported Outcomes Measurement Information System (PROMIS) and its core citations
- Behavior change measures Provides 186 measures of behavior change and their citations
- EMA item repository A shared repository of standard, conventional items (questions and answer choices) used in ecological momentary assessment (EMA) across a wide range of domains, such as emotion, physical activity, mood, and behaviors
- Ecological Momentary Assessment in Mental Health Research - A Practical Introduction, with Examples in R A good introductory ebook on EMA and its application in mental health
Live as a Ph.D. student
Personality and habits
- Learning to think: Becoming a functional PhD student
- Some Modest Advice for Graduate Students
- The Graduate Student Survival Guide
- NirandFar The official website of Nir Eyal, author of well-known books (e.g., Hooked) on behavioral design that can be applied to mobile app design. The book was originally recommended by my advisor during a course. Beyond applying the ideas to app design, I find it important to understand how our brains and behaviors are affected by the technologies around us, both positively and negatively. This awareness is especially important for STEM Ph.D. students, because our work is about creating new technical gadgets and concepts, and we are literally surrounded by them
- How to take criticism well An article about how to communicate in tough situations (e.g., being rebuked by your advisor), written by Sabina Nawaz, a famous leadership coach. I once participated in her Ph.D. communication training workshop at Northeastern
- Unhappy At Work? Persuade Your Boss To Redefine Your Job Another article by Sabina Nawaz about how to communicate in tough situations. It is about getting what you want from your advisor, and the advice seems useful in any management relationship
Listen to the researchers
- Serial mentor Suggestions for being a successful researcher
- Philip Guo Philip Guo’s articles, about general research, teaching, and HCI
- @justsaysinmice Just Says In Mice, a science Twitter account that corrects the false promises and missing context of research headlines
Start with Grants
Serve in the community
Copyright about your publication
Conduct research studies
- The history and development of N-of-1 trials
- Study Quality Assessment Tools The quality assessment tools for health-related research studies provided by NIH
Solve productivity crisis
The crisis of managing the exploding amount of digital material generated by your projects.
- DMPtool Document your data management plan before doing research
- How to organize code in Python if you are a scientist Opinionated software engineering principles for scientific code. They may also apply to other programming languages
- Try the following organization diagram + a README file recommended by Northeastern University Library
- How to name your digital stuff (except for scripts)
Identify impact of papers
I exclude the well-known Google Scholar and Microsoft Academic Search from the list because they are not indexing services and do not exclude predatory journals and conferences.
Web of Science is excluded because it requires a subscription, though your school will usually have one. Its citation report is probably the most reliable service for finding impactful journals.
- Dimensions AI Provides a fuller picture of how publications are connected with each other and with other domains
- Leiden Manifesto for research metrics “10 principles to guide research evaluation”. This is about addressing current issues in publication evaluation
- Think. Check. Submit A neat website that helps researchers choose trustworthy journals or conferences to publish in through a well-organized questionnaire
- Directory of open access journals Search and verify whether an open access journal is actually certified
- Beall’s list of predatory journals and publishers The name says it all
- Journal Citation Reports aggregates the meaningful connections of citations
- Jane journal finder finds potential journals based on your title and abstract
Write an impactful paper
- How to Craft Your Article Title to Increase Views and Citations It all starts with a good title that, according to Prof. Roger M. Enoka, can boost the citations and views of your paper
- Hemingway editor I find this online editor extremely helpful for checking readability, especially when used together with Grammarly. The online editor is free