How To Write A Research Paper In Machine Learning



Writing a research paper means communicating your idea to the outside world. The task is always demanding, but it gets easier with practice. This tutorial is for students writing their first research paper in machine learning. The paper can be either your semester project report or a submission to a conference. The tips below cover both cases, so please take the time to study them, and revisit them while writing the first draft of your paper.

Your goal is to hand in a well-thought-out and well-written paper; reaching that state requires multiple steps. There are a number of incredible tutorials and presentations out there (see the end of the post). The tutorial below is divided into two main parts: i) tips on how to get started and write the first section of the paper, and ii) general tips for refining the paper.

Writing one section (at a time)

Let us start with the first round of writing, i.e., populating the first section of the paper. Importantly, papers are not written in the order in which the sections appear in the final version, so you do not need to start from the introduction. In fact, we rarely start writing from the introduction. Assuming we want to start from the experimental section, the following steps should already provide a first version of that section:
  1. Include the tables/results even if the final numbers are not yet ready.
  2. Write 2-3 sentences that state the results and comment on them, e.g., does the proposed method perform on par with the baselines? What is the strongest empirical evidence? If your results are not conclusive yet, you can postpone this and the next step until a later point.
  3. Then, write a couple of sentences on why the reader should care about the experiment, and whether the improvement is important, e.g., is it consistent across the experiments?
  4. Then, develop the details of your experiments further. What are the datasets you used? Could you describe the datasets briefly? Are those standard benchmarks?
  5. Next up, you can write a paragraph (or more) about the implementation details. What are the hyper-parameters that require explanation for reproducibility?
  6. Are there some additional experiments or toy settings that you need to include?
  7. Collect all the text, put it in a logical order (e.g., implementation details should come before the experimental results), and revise the text to make it more concise.

Each of the steps should be easy to write down, so do not move on to the next one until you complete the previous one. These seven steps are meant as a guide for the first round of writing the first section of your paper; similar steps can be followed for the other sections as well.
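
The seven steps above can be sketched as a LaTeX skeleton. This is only one possible layout, not a required structure: the subsection titles, the label names, the 'Baseline'/'Proposed' rows, and the 'xx.x' entries are all placeholders I made up for illustration, and the booktabs package is my assumption for the table rules.

```latex
\documentclass{article}
\usepackage{booktabs} % \toprule, \midrule, \bottomrule (assumed choice)

\begin{document}

\section{Experiments}
\label{sec:experiments}

% Step 7: implementation details come before the results.
\subsection{Datasets}
\label{sec:experiments_datasets}
% Step 4: which datasets, a brief description, standard benchmarks or not?

\subsection{Implementation details}
\label{sec:experiments_implementation}
% Step 5: hyper-parameters that require explanation for reproducibility.

\subsection{Results}
\label{sec:experiments_results}
% Steps 2-3: 2-3 sentences on the results, then why the reader should care.

% Step 1: the table goes in even before the final numbers are ready.
\begin{table}[t]
  \centering
  \caption{Placeholder: what is compared, on which dataset, and the takeaway.}
  \label{tab:experiments_main}
  \begin{tabular}{lcc}
    \toprule
    Method   & Accuracy ($\uparrow$) & Parameters ($\downarrow$) \\
    \midrule
    Baseline & xx.x                  & xx.x \\
    Proposed & xx.x                  & xx.x \\
    \bottomrule
  \end{tabular}
\end{table}

\subsection{Additional experiments}
\label{sec:experiments_additional}
% Step 6: additional experiments or toy settings, if needed.

\end{document}
```

Filling in the comments one step at a time gives you the first round of the section; the 'xx.x' entries are replaced once the final numbers are ready.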

Should the first section be the experimental section? Not necessarily; I suggest you start from the section you are most comfortable writing about. For instance, you should be able to identify the works most closely related to yours and write down the differences in the related work section. Similarly, the theorems you have proved are already settled, and you could write them down in the methodology section.

General tips on writing

You have now finished the first round of writing. Is this the first draft? Not yet; there are a few steps that you should follow to make the manuscript more accessible:
  • The paper is not a diary to be written in chronological order. The fact that you conducted an experiment first does not mean that it should be presented first, especially if you are writing a conference submission with a limited number of pages.
  • Notation, notation, notation. The notation is a critical component of your paper (especially for theoretical works) and it should be consistent throughout the text. A couple of simple rules are the following:
    • Symbols used in equations should also be typeset in math mode in the text. For instance, 'we use symbol K for [vvv]' should be replaced by 'we use symbol $K$ for [vvv]'.
    • Ordinal superscripts are often required, and they can be typeset in several ways, e.g., K$^{th}$ or $K^{th}$. Please use $K^{\text{th}}$ in every case.
    • If specific styling is used for symbols, it should be consistent throughout the paper. For instance, if vectors are denoted in bold (btw, the package bm is recommended for those), they should be bold throughout the text.
    • Math equations should either finish with a comma (,) or a period (.), depending on whether there is a sentence continuing/explaining after the equation.
    • Use distinctive symbols and avoid overloading notation, e.g., do not use both $u$ and $\upsilon$, unless necessary.
  • Abbreviations:
    • Even though each task has a number of established abbreviations, the reader might not be aware of them. Thus, use the full name the first time and define the abbreviation there; you can then use the abbreviation in the rest of the paper. For instance, 'Neural networks (NNs)' defines an abbreviation that can be used from then on.
    • Define an abbreviation only if you need one; for instance, if you do not use the abbreviation in the rest of the text, you do not need to define it.
    • Abbreviations (including datasets, methods, model names) should be consistent throughout the text. One way to guarantee that is by using aliases in LaTeX. For instance, you can define '\newcommand{\cifarten}{Cifar-10}' in the preamble and then refer to it as \cifarten throughout the paper. Even simple names can end up written differently in different parts of the text, e.g., Cifar10, CIFAR10, Cifar-10; thus, using aliases is recommended for consistency.
  • The '\cite{}' commands: Many ML conference templates provide the commands \citet and \citep (from the natbib package), which can cause confusion. In case those commands exist in your template, here are a couple of tips on using them more effectively:
    • \citet can be used when you want to refer to the authors in the flow of the sentence. For instance, 'the method of \citet{authors}' results in the following text: 'the method of Authors et al.'.
    • \citep is used when you want to include parentheses. For instance, 'NNs can be used for synthesising images \citep{authors}' results in the following text: 'NNs can be used for synthesising images (Authors et al.)'.
    • Replace 'The authors in \citet{authors}' with '\citet{authors}'.
  • LaTeX Labels: First-time users of LaTeX often do not use \label{}, but the label command is a great tool that lets you refer to an equation/figure/section at a later point. The labelling rules I follow are:
    • Devote the first few characters to the type of label. For instance, 'eq' for equation, 'sec' for section, 'fig' for figure. This is especially helpful if you are writing in an IDE with auto-complete.
    • The next few characters should describe the role of the equation. Is it the main model equation or just the first model?
    • Follow a consistent pattern for all labels. My advice is [type-of-label]:[role]_[additional-info]. For instance, \label{eq:model1_recursive} is distinctive enough for the context of the paper.
    • In the equations, write the label at the end, just before the \end{equation} command.
    • Write a label in every section, every table/figure and every equation.
    • Avoid duplicate labels. Frequently, an equation/table is copied and modified to express a new model/experiment, but the label is not modified accordingly. Duplicate labels create issues during LaTeX compilation, e.g., a reference may point to the wrong equation/table. Unfortunately, they show up only as 'warnings' in popular cloud-based editors, such as Overleaf, so they are easy to ignore.
  • References: The style of the references is often overlooked; however, the following simple rules can vastly improve it:
    • Accepted papers should be cited with the identifier of the conference/journal, rather than the ArXiv version. Attention: Google Scholar often provides the citation for the ArXiv version and not for the version accepted at the conference/journal.
    • The style and the names of conferences/journals should be consistent. An easy way to achieve that is by defining an abbreviation in the bibtex file and then using it as the booktitle. For instance, if you define @STRING{ICLR = "International Conference on Learning Representations"}, you can then write 'booktitle=ICLR' in the respective field of a citation.
    • Pay special attention to citing the correct papers for datasets and benchmarks.
  • Figures and tables:
    • The captions should be self-contained, descriptive and concise. For instance, writing 'Accuracy on CIFAR10' is not enough; a more descriptive caption could be: 'Comparison of the different architectures when trained on CIFAR10. Notice that the $\Pi$-net requires fewer parameters than ResNet to achieve the same accuracy, which demonstrates the expressivity of the proposed model.'
    • Styling: If you have multiple figures/tables that depict the same topic, it is recommended to use the same styling (e.g., line shape, line color, legends).
    • Make it easy for the reader to understand whether higher or lower values are better. This can be achieved by including a dedicated symbol, e.g., $\downarrow$.
    • The best value (per metric per experiment) should be set in bold so that it is easily identifiable.
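
The notation, abbreviation, citation, label and reference tips above can be combined in one small document. The following is a minimal sketch under my own assumptions, not a template from any conference: the bibliography entry (key 'authors', as in the examples above) is a made-up placeholder, the recursive equation is a toy model chosen only to have something to label, and plainnat is just one bibliography style that works with natbib.

```latex
% Writes refs.bib next to the .tex file (skipped if the file already exists).
\begin{filecontents}{refs.bib}
@STRING{ICLR = "International Conference on Learning Representations"}
@inproceedings{authors,
  author    = {Author, A. and Coauthor, B. and Colleague, C.},
  title     = {A Placeholder Title},
  booktitle = ICLR,
  year      = {2020}
}
\end{filecontents}

\documentclass{article}
\usepackage{amsmath} % \text for K^{\text{th}}, \eqref for equations
\usepackage{bm}      % \bm for bold vectors and matrices
\usepackage{natbib}  % \citet and \citep

% Alias: the dataset name is spelled the same way everywhere.
\newcommand{\cifarten}{Cifar-10}

\begin{document}

\section{Method}
\label{sec:method}

% \citep adds parentheses; \citet puts the authors in the sentence.
Neural networks (NNs) can be used for synthesising images \citep{authors}.
We build on the method of \citet{authors} and evaluate on \cifarten{}.
% The trailing {} keeps the space after the alias from being swallowed.

% Symbols stay in math mode in the text; vectors are bold throughout.
Let $\bm{x}$ denote the input vector and $K$ the number of layers.
The output of the $k^{\text{th}}$ layer is
\begin{equation}
  \bm{y}_{k} = \bm{W}_{k} \bm{y}_{k-1} + \bm{b}_{k},
  \label{eq:model1_recursive}
\end{equation}
where $\bm{y}_{0} = \bm{x}$. The equation ends with a comma because the
sentence continues after it.

\section{Experiments}
\label{sec:experiments}

The model of \eqref{eq:model1_recursive}, defined in
Section~\ref{sec:method}, is trained on \cifarten{}.

\bibliographystyle{plainnat}
\bibliography{refs}

\end{document}
```

Compiling with pdflatex, then bibtex, then pdflatex twice should resolve both the citations and the cross-references. The label sits just before \end{equation} and follows the [type-of-label]:[role]_[additional-info] pattern.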


I am thankful to Zhenyu Zhu, Zhiyuan Wu, Bohan Wang, and Aleksandr Timofeev for the feedback they provided and their tips on the LaTeX basics.


The list I compiled above covers only the basics; there is a wealth of additional resources that you can easily access to learn more about how to write your research paper.

Video tutorials from incredible researchers:

Booklets and other resources on writing:

Useful resource: 'The Elements of Style', a style guide for writing in English.