Research – A High Schooler’s Take
By Jeffrey Pan, Phillips Academy Andover
In a dynamic, emerging field like machine learning, research opportunities are plentiful. If you have some machine learning experience, research is a great way to demonstrate your expertise, gain experience working on large projects, and apply your knowledge to solve real-world problems. This article aims to explain the research process in a concrete way for students getting into research for the first time.
First, there are a few important things to understand about the research process. Learning these lessons early on will make you a stronger researcher and help you avoid the common pitfalls that new researchers fall into.
1. Research is not a linear series of steps
It’s important to understand that research is very much a cycle. Oftentimes, you will find yourself going back to redesigning experiments or even searching for another topic. That’s totally normal! Don’t feel like you have to constantly be moving forward in your research to be making progress. Remember that time spent rethinking your research direction and experiments may end up saving you countless hours spent pursuing a less fruitful research direction.
2. It’s about the journey, not the destination
You’ll benefit the most from the research process if you focus on the self-improvement aspect of research rather than the end results. Getting promising research results is sometimes a matter of luck: there’s no formula for finding a novel approach that happens to perform better than existing methods. So don’t worry about publishing a paper or getting good results. Instead, focus on discovering something new and learning to become a better researcher. This process of self-improvement will actually help you find more success in your research than if you focus entirely on results.
The Research Process
Read the existing literature
Research is all about pushing the limits of current knowledge in a particular area, meaning that you must first have a solid understanding of the area you want to research.
Start with high-level articles and work your way down. High-level overview sources such as blog posts and Medium articles can often provide excellent explanations that will make it easy to understand the basics of a certain field. Don’t stop here though — oftentimes such articles oversimplify the field to be more accessible to a wider audience. To learn more about a field, you’ll want more comprehensive overviews.
Literature review papers
Literature review papers are written for the exact purpose of providing a comprehensive overview of a research field. In most fields, they are written by long-established experts, so you can generally trust that the information you’re reading is accurate.
Oftentimes, literature review works will also highlight future research directions that haven’t been previously studied. This can help you generate ideas for research topics to study. These papers will certainly be more technical than the high-level articles you’ve read previously, so don’t be discouraged if it takes you a while to get through your first literature review work. It’s critical that you fully understand the first literature review paper you read — if you get stuck, try searching for unfamiliar terms or going back to higher-level articles. Reading papers will only get easier from here on. Once you’ve read a few literature review papers, you should have a broad overview of the research field you want to study.
Foundational papers
After reading several literature review papers, you’ll understand broadly the existing work that has gone into the research field you wish to study, but it’s best to read about fundamental contributions in the field from the authors themselves. Go through the literature review papers you’ve read again. For each research paper the review identifies as significant, find the original paper and read it through. Again, the original paper will be more technical than the literature review work, so don’t stress if you don’t get the paper the first time you read it. In fact, you should never only read a paper once. Reread, highlight, annotate, and look up unfamiliar concepts until you have a solid understanding of the paper. As a test of how well you understand the work, try explaining the paper to someone unfamiliar with it — if they can also understand the paper, your understanding is certainly very solid.
Brush up on requisite skills
It’s very unlikely that you’ve gone into your research field completely prepared to start doing research. After gaining a theoretical understanding of the field, it’s time to make sure that you can apply your knowledge and implement it in code.
For several of the foundational works you’ve just read, try to find implementations online. Many papers will open-source their code, making it pretty easy to find an implementation. Read through all of the code and try to run it. Ideally, you should be comfortable explaining what each line of the code does. This way, you’ll gain a strong practical understanding of the paper you read, meaning that when you expand on it, you’ll be confident in the code you write.
Identify a problem
Deciding on a specific research topic is both the hardest and most important part of the research process. A good research topic has these three qualities:
- New — research is inherently about discovering new things, so your work should differ from the existing literature in a meaningful way. As a high schooler doing research, it’s easiest to do meaningful work when there’s less existing literature discussing your research topic. Although it’s not impossible to put out interesting work on a topic popular among researchers, you’ll have more competition.
- Impactful — the more potential your research project has to directly impact the field, the more interested others will be. Bonus points if your research can be easily applied directly to industry — that’s how many tech startups are born.
- Interesting — counterintuitive and surprising results usually draw more attention than results that are more immediately obvious. Of course, you won’t know the final results until you finish running your experiments, but try tackling problems where the answers aren’t immediately obvious.
Literature review (again)
Now that you’ve chosen a research topic, it’s important to go back and read the existing literature again, this time focusing on your specific research topic rather than the field as a whole. For each paper, look in particular for the following:
- How they design their experiments
- Their experimental results and conclusions
- Limitations and future extensions of their work
As a researcher, you want to ensure that your work builds upon existing research. By looking at how previous researchers have tackled the topic you now want to explore, you can get a better idea of how to conduct your own research, from potential directions to explore to how to design your experiments.
Design experiments
Designing your experiments carefully is crucial: the validity of your conclusions depends on your experiments being soundly constructed. Keep a few things in mind when deciding which experiments to run:
- What metrics do you care about? Find key, quantifiable metrics that can disprove/prove your hypothesis. Ensure that you don’t get carried away measuring things that ultimately won’t affect the outcome of your research project. This sounds intuitive enough, but when you start implementing your experiments, you may end up down a rabbit hole of collecting lots of results, which may not actually provide much value.
- Are your experiments comparable to existing work? Try to ensure that your experiments are designed in a way so that you can easily draw comparisons between your results and existing results. Even if your results are promising, it’ll be much more difficult for others to understand the significance of your work unless you can easily put it in the context of existing research.
- Are you keeping it simple? Ensure that your experiments are as straightforward as possible while still generating enough concrete results for you to analyze. A simple, easy-to-follow workflow will both make your life easier and help other researchers reproduce your work.
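To make the three questions above concrete, here is a minimal sketch of what a simple, comparable experiment can look like in code. Everything in it is a made-up stand-in chosen for illustration: the toy dataset, the two `*_predict` functions, and the choice of accuracy as the single metric are assumptions, not part of any particular paper or library. The idea it shows is to pick one metric, evaluate your method and an existing baseline on exactly the same seeded data, and repeat over several seeds so that others can reproduce the comparison.

```python
import random
import statistics


def make_dataset(n, seed):
    """Toy binary dataset (hypothetical): label is 1 when x > 0.5, with 10% label noise."""
    rng = random.Random(seed)  # a private, seeded generator keeps runs reproducible
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.1:
            label = 1 - label  # flip the label to simulate noise
        data.append((x, label))
    return data


def baseline_predict(x):
    """Stand-in for an existing method: always predicts class 1."""
    return 1


def proposed_predict(x):
    """Stand-in for the new method: thresholds the input at 0.5."""
    return int(x > 0.5)


def accuracy(predict, data):
    """The one metric this experiment cares about."""
    correct = sum(predict(x) == y for x, y in data)
    return correct / len(data)


def run_experiment(predict, seeds, n=1000):
    """Evaluate one method on the same seeded datasets; return per-seed accuracy."""
    return [accuracy(predict, make_dataset(n, seed)) for seed in seeds]


SEEDS = [0, 1, 2, 3, 4]
results = {
    "baseline": run_experiment(baseline_predict, SEEDS),
    "proposed": run_experiment(proposed_predict, SEEDS),
}

for name, scores in results.items():
    mean = statistics.mean(scores)
    spread = statistics.stdev(scores)
    print(f"{name}: mean accuracy={mean:.3f}, stdev={spread:.3f}")
```

Because both methods see identical data for each seed, any difference in the printed numbers comes from the methods themselves rather than from the data split. In a real project you would swap the toy pieces for your actual dataset, the baseline from the literature, and the metric that existing papers report.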
Analyze results
Analyzing your results plays one of the most important roles in the entire research process. Although it may seem like a separate step from running your experiments, analyzing your results early on can save a lot of pain in the long run. Here are a few tips for staying on top of all the raw data you’ll be collecting:
- Start processing your results early. Make sure you don’t get so caught up in running experiments that you forget to process results from earlier experiments. After all, the results from earlier experiments will determine whether you need to run more experiments in the first place. By processing your results early on, you can decide if there are issues with your experimental setup or your research direction. However…
- Don’t be overly influenced by early experimental results. This may seem to fly in the face of the first point, but it’s important that your experimental setup is driving your results, not the other way around. Don’t try to “hack” your experiments so they return good results — besides the massive ethical issues, you also won’t learn how to become a better researcher. In short, process your results early to find any glaring flaws in your setup or hypothesis, but you should only change direction if there are major issues.
- Try to process and visualize your results in different ways. Even if your hypothesis doesn’t appear to be true, don’t give up! Try visualizing your results in different ways. You may discover unexpected trends!
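As a rough illustration of the tips above, the sketch below summarizes the same set of runs in two different ways and prints a quick text chart. The records in `RUNS` are placeholder values invented for this example, not real measurements, and the field names (`lr`, `seed`, `accuracy`) and helper functions are hypothetical. It uses only the Python standard library, so you can run it as soon as your first few experiments finish.

```python
from collections import defaultdict
import statistics

# Placeholder records, NOT real measurements: one entry per finished run.
RUNS = [
    {"lr": 0.1, "seed": 0, "accuracy": 0.62},
    {"lr": 0.1, "seed": 1, "accuracy": 0.58},
    {"lr": 0.1, "seed": 2, "accuracy": 0.60},
    {"lr": 0.01, "seed": 0, "accuracy": 0.81},
    {"lr": 0.01, "seed": 1, "accuracy": 0.79},
    {"lr": 0.01, "seed": 2, "accuracy": 0.80},
    {"lr": 0.001, "seed": 0, "accuracy": 0.70},
    {"lr": 0.001, "seed": 1, "accuracy": 0.72},
    {"lr": 0.001, "seed": 2, "accuracy": 0.71},
]


def group_mean(runs, key, metric="accuracy"):
    """Average the metric over all runs that share the same value of `key`."""
    groups = defaultdict(list)
    for run in runs:
        groups[run[key]].append(run[metric])
    return {k: statistics.mean(v) for k, v in sorted(groups.items())}


def text_bar(value, width=40):
    """Render a value in [0, 1] as a bar of '#' characters."""
    return "#" * round(value * width)


# View 1: does the hyperparameter matter?
by_lr = group_mean(RUNS, "lr")
# View 2: does any single seed behave unusually?
by_seed = group_mean(RUNS, "seed")

print("Mean accuracy by learning rate")
for lr, acc in by_lr.items():
    print(f"  lr={lr:<6} {acc:.3f} {text_bar(acc)}")

print("Mean accuracy by seed")
for seed, acc in by_seed.items():
    print(f"  seed={seed:<4} {acc:.3f} {text_bar(acc)}")
```

Grouping by a different key is a one-line change, which makes it cheap to look at the same raw data from several angles before deciding whether anything in your setup needs to change.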
Conclusion
In summary, the research process is a lot to deal with, especially if it’s your first research project. This guide will hopefully give some more structure so you feel less lost. However, the research process is ultimately yours — make it your own, feel free to break some rules, and remember that you’ll gain the most by treating it as a learning experience rather than a quest for academic glory.
About the author: Jeffrey Pan is a rising senior at Phillips Academy Andover. As a young AI enthusiast, he has been conducting ML research since 9th grade. In the summer of 2018, he worked with Prof. Qixing Huang’s group at the UT Austin Graphics & AI Lab, co-authored a paper on the relative pose estimation problem in computer vision, and presented at CVPR ’19, the premier conference in computer vision. Since June 2019, he has been doing research in Prof. Song Han’s lab at MIT on neural network compression techniques and adversarial machine learning. He recently published a paper, “On Intrinsic Dataset Properties for Adversarial Machine Learning,” which won a Best Paper Award at the AdvML’20 workshop.