I usually write the short story series Out There on this blog, but I recently attended a talk at North Carolina State University, where I am starting my chemistry graduate studies, and I thought I would share my thoughts on it.
The talk was given by the Welsh chemist Dr. Antony Williams. He earned his PhD from the University of London in 1985. He came to the US in 1992 to work for Eastman Kodak. After leaving Kodak, discontented with the CEO’s comments that digital photography posed no threat to the company, he founded ChemSpider, a database of chemical information that he sold to the Royal Society of Chemistry in 2009.
Williams advocates for open access to all scientific information, from mundane facts such as a substance’s melting point to new data from cutting-edge research, such as steps in the synthesis of compounds that may lead to anticancer drugs. Williams argues that all data sharing is good and is necessary to advance the field of chemistry. He currently lives in the Raleigh area. His talk can be classified under the umbrella term “cheminformatics,” which is the use of computers and information systems in the field of chemistry.
Certainly, I agree that platforms such as ChemSpider do chemistry a valuable service. We can quickly and easily look up the structure, boiling point, etc. of over 30 million chemicals. The data are presented in a standardized format, so there is no need to hunt around for information, and we can use it in planning our work in the lab. However, Williams also argued that research endeavors should be broadcast to the masses on multiple platforms before publication. Every page in my lab notebook, every NMR spectrum I ever obtain, the details of every reaction I run and why I ran it, every PowerPoint presentation I ever give, on and on, should all be put on my LinkedIn profile, blog, Twitter account, and many other platforms whose existence I was unaware of before the talk.
A fellow first-year grad student asked an excellent, probing question: did Williams think that this hyper-sharing of research data would lead to the demise of traditional scientific journals? The way it currently works is that you amass data from your research, write a paper on it, and publish it in a scientific journal. These journals are published either by non-profits such as the American Chemical Society or the Royal Society of Chemistry, or by private publishing companies such as Elsevier and Springer. You put in what you want, and thus have control over what information the readers see. The publisher also has influence over what the reader sees, or whether the paper will be published at all, and of course, one must either pay for a subscription to the journal or be affiliated with an institution that pays for it. Even though the papers are available online, commenting is not allowed (although if you read the comments on any YouTube video, this may not be a bad thing). Williams did not really give a straight answer to the student’s question; he seemed to harbor some resentment towards the publishing companies and may wish that this demise would happen, but I will digress no further.
But this raises a key question for me, as a young scientist in this day and age. It is considered socially acceptable for me to broadcast to the entire world that I am sitting in a bar in Raleigh, North Carolina drinking a Ballast Point Sculpin IPA, for example. Is it professionally or ethically acceptable for me to broadcast to the entire world my group’s proposed synthesis, or the data we have worked hard to obtain? According to Williams, I can receive feedback on Twitter within a few minutes on whether the proposed synthesis is genius or bogus. And the NMR spectrum I obtain, while maybe worthless to me, might benefit someone else. But it could also compromise our group’s position. Unlike with traditional journals, I have no control over who receives it, so I have to assume someone with nefarious intentions will. Some group in China, India, or worse, Chapel Hill may substitute my data for a failed experiment of their own, and publish first. This would not be fair to me, my group members, or my professor.
However, it is also true that a group somewhere else may take valuable information, learn from my mistake, and develop some wonderful molecule that solves all of the world’s problems. Clearly, this is a complex issue, and taking an extreme position on either side is inappropriate. While we need to be cautious, we should not let paranoia get completely in the way of scientific collaboration. We must ensure that whatever information we share, and however we share it, fosters a symbiotic relationship between scientists, not a parasitic one. If we are to broadcast info to theoretically everyone, it should be info that would not undermine our group’s competitiveness. Post an NMR spectrum of a compound whose structure we cannot solve and solicit suggestions, but do not give details about how we made the compound or what we are working on. Save more sensitive info for more trusted colleagues. It is just like how I might share what beer I am drinking, but not the number on the credit card used to purchase it.