Tuesday 2 February 2021

Information overload in research


In academia, knowledge is stored in a variety of locations. Traditionally these include books, journal articles, lectures, conference talks and abstracts. There are now a huge number of other places to find knowledge, including websites, YouTube videos, preprint servers, blogs, audiobooks, massive open online courses, online lecture notes, online repositories, GitHub and many more.

All of these sources can lead to information overload for academics. It can seem like a never-ending stream of new information. Another issue is that we are bombarded by so much non-academic information that it is easy to get information fatigue, and the thought of reading papers on top of it all becomes daunting.

This blog post is about how I process the information I receive as an academic and how I reduce the amount of unnecessary information I'm exposed to so that I don't get overloaded. I will try to explain how I developed my own methods so that instead of trying to emulate them you can see the sort of decisions I made and decide for yourself how best to streamline your information channels. This is intended for those who are in research but I am sure some of the insights will be applicable for those in other information-heavy jobs.

Why do I prioritise reading the literature?


Although it can be a lot of work, I do prioritise reading the literature from my field. I personally enjoy this, which helps me stay motivated to keep up to date with new publications. Do you get a buzz out of finding out new information? While it can be difficult to dedicate time to this, the feeling of learning something important can motivate us to prioritise the literature. 


Another significant reason I prioritise reading the literature is that I am lazy: the thought of starting a project, writing it up as a paper and then realising that someone else has already done the exact same thing drives a lot of my reading. There is too much reinvention of the wheel in research. In some ways repetition is helpful for demonstrating reproducibility, and rarely will two people do exactly the same work. However, more often than not we repeat experiments to death and waste our time and others' resources.

There is a trap here: if we see something attempted in the literature and it does not yield the result we were expecting, we can think that it is not worth trying ourselves. Basically, we get stuck in ways of thinking that we never challenge. The antidote for this, I think, is reading more widely, especially looking just outside the boundaries of your field. I find that doing this broadens my views rather than limiting them. Another antidote is not taking any one piece of information too seriously but waiting to see whether the data is reproduced in different configurations. This is important for both positive and negative hypotheses. So a healthy bit of skepticism is needed, along with a willingness to try out something you think might work.

Finally, reading the literature gives me lots of ideas. You may be familiar with the post-academic conference feeling of having so many ideas. A good conference exposes you to a wide range of interesting insights that broaden your view of the field and other topics. If you are constantly reading the literature then you will do better research and be able to understand how it fits more broadly into the literature, which will help to get your insights seen by a wider audience. 

Where to start?


When you approach a new topic it is often hard to know where to start with the literature. Here are some tips I have found useful: 
  • Textbooks are a good place to start. Head to the library and pick up some books. Find an approach to the topic that fits with your background, then carefully read that book, make notes and do some of the examples. I quite like trying to plot some of the figures in the book in Python to really prove to myself that I have understood the concepts.
  • Talk to people from that field. Look online: are there any free conferences or courses you can attend? Are there any workshops with recorded lectures? Can you attend a course in the department? You can usually sit at the back and audit the course. Attending a conference in person is very helpful for getting to know the cutting-edge research.
  • Find out who the leading groups in the field are. Look at their websites. See if they have any YouTube lectures. Which conferences do they attend? What have they published recently? How do they talk about other groups' work?
  • Theses are often very good places to start. Read theses written in your group or department. Most theses are available electronically on university library websites.
  • Can you find a review paper on the topic or related topics? Review papers can be a bit hit-and-miss. There are roughly two types of review papers: those written by junior academics with minimal input from a senior academic and those that are written with input from someone who has spent their whole career in the field. It is pretty easy to tell which is which. Another good sign is if many senior academics have come together to author a review. I will only read the really good reviews in full. As with the textbooks, make notes and try to reproduce the information. Review articles are very good for finding the big questions in the field and setting future research directions.
  • Don't just stop with Google Scholar. Microsoft has a search engine; CORE is for open access papers; US federal science has lots of reports and presentations that you will not find on Google; PubMed is good for biology and medicine; Semantic Scholar is designed to surface the most important papers in a field; and JSTOR is good for older sources.
  • Your library will have a nice search engine that you should look at also. Libraries have great resources like interlibrary loans that let you get some of the most recent books into your library to read for a small fee. Libraries will also have courses on research, referencing and software to use and I thoroughly recommend getting this training. 
  • The Internet Archive is a great resource with many older books available online for free. They also host the Wayback Machine, which can show you websites as they appeared at an earlier time. This can be very helpful for tracking down academics who have since retired or moved to other institutes.
  • Look at the references in the introduction of journal papers. In many fields, there will be a set of papers that are always cited. These are usually quite a good place to start. What are the big questions in this field at the moment and what are the seminal papers? 
  • Some education journals are very helpful at explaining complex topics with simple examples that have been developed with students in mind. I sometimes find helpful insights in education journals such as the Journal of Chemical Education.
  • I usually consider whether the journal has a good community reviewing the content. Often an unrelated journal will let a paper be published that a more domain-specific journal would not have allowed. I take this into account when I read a paper. You can also look at how people cite and talk about a paper to see how it is viewed in the community. Conferences are also good for seeing how results and values are interpreted. It is important to remember that the peer review process is not foolproof, yet the scientific community goes to a lot of effort to make sure rigorous work is done. I recommend following some of the scandals around paper mills and lack of replication as a warning of what not to do.
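
The Wayback Machine tip above can also be scripted. Below is a minimal sketch, assuming the Internet Archive's public availability endpoint (https://archive.org/wayback/available) and its JSON response with an "archived_snapshots" / "closest" entry. The function names are my own illustrative choices, and find_snapshot needs network access.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Internet Archive's availability API.
WAYBACK_API = "https://archive.org/wayback/available"


def build_query(page_url: str, timestamp: str = "") -> str:
    """Build the query URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{WAYBACK_API}?{urlencode(params)}"


def closest_snapshot(response_text: str) -> Optional[str]:
    """Return the URL of the closest archived snapshot, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_snapshot(page_url: str, timestamp: str = "") -> Optional[str]:
    """Network call: ask the Wayback Machine for the snapshot closest to timestamp."""
    with urlopen(build_query(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

For example, find_snapshot("example.com", "20100101") would ask for the snapshot of that page closest to 1 January 2010. Keeping the URL building and the parsing separate from the network call makes it easy to check the logic offline.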

Information pipelines

One helpful approach is thinking about your information flows. This will help with the consolidation and archiving of information into searchable content. Below is a drawing of my information flows. 


I make use of an RSS feed reader called Feedly (feedly.com) to get the titles and abstracts of papers and blogs, so that I can then decide whether I want to read them. I also get Google Scholar alerts for specific keywords and authors that I am following.
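
If you would rather script part of this triage yourself, the same idea can be sketched in a few lines of Python. This is only an illustration, not how Feedly works: it assumes a plain RSS 2.0 feed (un-namespaced item, title, link and description elements), and the keywords and function names are hypothetical. Atom feeds and RDF-style RSS 1.0 feeds use namespaced elements and would need different tag names.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Hypothetical keywords; replace with terms from your own field.
KEYWORDS = ("soot", "graphene")


def parse_rss(xml_text):
    """Extract title, link and summary from each <item> of an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter("item"):
        entries.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "summary": (item.findtext("description") or "").strip(),
        })
    return entries


def matches(entry, keywords=KEYWORDS):
    """True if any keyword appears in the title or summary (case-insensitive)."""
    text = f"{entry['title']} {entry['summary']}".lower()
    return any(keyword.lower() in text for keyword in keywords)


def fetch_matching(feed_url, keywords=KEYWORDS):
    """Network call: download a feed and keep only the entries that match."""
    with urlopen(feed_url, timeout=30) as resp:
        entries = parse_rss(resp.read())
    return [entry for entry in entries if matches(entry, keywords)]
```

A simple substring match like this is crude, but it is enough to cut a long feed down to a handful of titles worth a closer look.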

For all of the papers I want to read, I download the PDF into a folder on Dropbox. From there Mendeley automatically searches through the document and extracts the bibliographic information. I also prepare a BibTeX file by hand when I am writing papers in the typesetting software LaTeX. For LaTeX I use Overleaf, an online LaTeX server that lets you collaborate on papers with others. I used to have a folder system to sort papers but I found this unhelpful.
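
For anyone who has not prepared a BibTeX file by hand, the sketch below shows the shape of a single entry by generating one from Python. The helper name, citation key and field values are placeholders for illustration, not a real reference, and the function does no escaping of special LaTeX characters.

```python
def bibtex_entry(key, entry_type="article", **fields):
    """Format one BibTeX entry; fields are written in the order given."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Placeholder values only, to show the layout of an entry.
print(bibtex_entry(
    "placeholder2021",
    author="A. Author and B. Author",
    title="An Example Title",
    journal="Journal Name",
    year=2021,
))
```

The printed entry can be pasted into a .bib file and cited in LaTeX with \cite{placeholder2021}.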

There is lots of information that does not fit into the category of easily downloadable, for example websites and YouTube lectures. For these examples I have a Google document written like a journal entry. I always start the new day at the top of the Google document and then add in the URL and perhaps a picture into the document with some notes. This acts as my online lab book. 

For more intense activities such as the writing of a review paper, I will make use of a Google Slides deck. This allows me to include references to papers through a URL to the paper online. It also allows for a better aggregation of figures and charts, which are difficult to arrange in a normal document. I also find the slides easier to move around when trying to sort out the order of the review narrative.

Another information pipeline that I find very helpful is following the papers that are referenced in a paper I enjoyed reading. What papers do they cite and value? I often like to check whether the referenced paper really says what the authors are claiming. Are they overstepping its claims to support their hypothesis? I find it helpful to talk to others about what they thought about a paper and what shortcomings they could see.

I have tried many different software packages and options such as Evernote, Word documents, handwritten notes and Microsoft OneNote. The important thing is to find a set of tools that you want to use and avoid overcomplicating things. The use of online software like Google Docs and Dropbox greatly simplified my information flows as I could access them from anywhere.

It is important to regularly reassess this information flow. How resilient is it? I recently had my iPad stolen at an airport. All of my typed notes were saved automatically to Dropbox; however, my handwritten notes on my iPad were lost. After this, I set up automatic archiving of my handwritten notes through the Notability app. Another question to ask is how many times you deal with the same piece of information without actually reading it. Can you reduce this by automating or simplifying your process?

Reducing distractions and removing barriers


Reading and processing information is hard. I find myself getting distracted or struggling to get started. Here are some tips I use:
  • Get a website blocker on your computer. I use the Chrome plugin BlockSite. Add your social media and news sites. Without blocking software I get nothing done. To keep social media separate I recommend software that can aggregate all of your social media channels, such as Rambox. This keeps your emails and family WhatsApp group out of your work and also helps you not to miss anything. I have Do Not Disturb enabled on my phone during the day so that I am not distracted by emails, messages, etc. I have also culled most social media apps from my phone.
  • Use a timekeeping approach to focus and take regular breaks. The approach I use is the Pomodoro technique with 25 minutes of focused work with a 5 minute break to follow. This helps to avoid decision paralysis where you have too many things to read/do. I find that writing down a task and doing it for 25 minutes without distractions helps me to get started. I use the website Pomodoro. I also only check my emails and news during my break. 
  • Use the text-to-speech add-ons in Chrome or on the iPad to read out the paper for you. I find this helps me when I am feeling lazy. Often, after having a paragraph read out to me for a few minutes, I can concentrate enough to finish reading the paper myself.
  • Read things on a tablet or an e-reader. Reading in a comfortable chair with a tablet is more attractive than reading on my laptop screen. Using the iPad with the pen input also allows quick annotation.
  • Get together with friends to read things e.g. journal clubs. This forces you to read the paper and process it by explaining it to others. 
  • If you have an idea while reading a paper, write it down and then carry on reading.
  • See if your university allows you to use a VPN. This can sometimes save you from having to log in for every single paper you download, which speeds up the process with academic papers.
  • If I cannot access a paper, I usually head to ResearchGate and see if the authors have uploaded a preprint. Sometimes the university library will require the authors to archive their paper preprint. If all else fails I will email the corresponding author. I have yet to have someone refuse my request.
  • I regularly trim the RSS feeds that are no longer useful or needed.

Particular insights for the chemical sciences

As this is my area of expertise, I thought I would share some tips for reviewing the literature within the physical chemistry subfield of the chemical sciences.
  • Don't know the name of a molecule? Use the ChemSpider structure search. Alternatively, you can build the molecule in MarvinSketch or Avogadro and view the molecule's properties or search for its IUPAC name.
  • There are often many names for the same molecule so try them all in search engines. Often different fields will use different naming e.g. nanographenes, pericondensed aromatics, carbon flakes etc. Also try reaction classes or general naming strategies for the structures you are interested in e.g. aromatic, aliphatic, arynes etc.
  • There are a number of databases of chemical information that are worth checking out (see the review of these from my colleague Angiras Menon). Some of my favourites are PubChem, the NIST CCCBDB (https://cccbdb.nist.gov/), the Protein Data Bank and the Cambridge Structural Database.
  • Search for software on GitHub; there are lots of very helpful packages. Academic websites also include code that is free to use. 
  • Many journals and governments require data associated with journal articles to be archived in open repositories. It is worth looking for input files or data before measuring/calculating it yourself. 
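
Looking up a molecule by one of its many names can also be scripted. The sketch below assumes PubChem's PUG REST interface and its "compound/name/.../property/.../JSON" URL pattern; the function names are illustrative choices of mine, and lookup needs network access.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of PubChem's PUG REST interface.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(name, properties=("IUPACName", "MolecularFormula")):
    """Build a PUG REST URL requesting properties for a compound name."""
    encoded_name = quote(name, safe="")
    property_list = ",".join(properties)
    return f"{PUG_REST}/compound/name/{encoded_name}/property/{property_list}/JSON"


def parse_properties(response_text):
    """Return the list of property records from a PUG REST JSON response."""
    data = json.loads(response_text)
    return data.get("PropertyTable", {}).get("Properties", [])


def lookup(name):
    """Network call: fetch IUPAC name and formula for a common name."""
    with urlopen(property_url(name), timeout=30) as resp:
        return parse_properties(resp.read().decode("utf-8"))
```

Calling lookup("pyrene"), for instance, should return a list of records, each a dictionary with the PubChem compound ID and the requested properties. If a name is not recognised the server responds with an HTTP error, which urlopen raises as an exception, so it is worth trying several synonyms.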

What about news and social media?

I am undecided about the role of social media. I have found ResearchGate and LinkedIn most helpful for connecting me with researchers and research. However, I have found Twitter, Facebook and Instagram more distracting. I know quite a few academics, and journals as well, who share new papers on Twitter, but I find the mix of work and personal uncomfortable.

The news media is a little easier. I like catching up on the news. Often it can be helpful in understanding trends in government spending that will directly impact me as a researcher and it is important to be well rounded in my knowledge. I have tried to balance my news so as not to become polarised by it. I find media bias charts to be helpful in getting multiple perspectives on an issue. However, I have made it difficult to access news on my desktop computer as it is more distracting than helpful. 

For specific science news sites, e.g. PhysOrg and ScienceDaily, I will pull the RSS feed into Feedly to get articles that might be interesting. I also find Scientific American, National Geographic, Chemistry World, The Conversation, Education in Chemistry and C&EN to be good sources of general knowledge about science topics.

Information flow as a publisher of research

As academics, we are not just readers of information but also publishers, so there is another important aspect to consider: how your research can be most effectively communicated. This means putting yourself in the shoes of a researcher interested in your topic and thinking about how to make your insights accessible. Journal articles are still considered the highest quality output, having been vetted by your academic community, but after this it is important to consider how you communicate that research through your group's website, talks, conferences, press releases etc. However, this is a conversation for another blog post.
