Suggested readings

This page contains a list of links to articles or generally any text on the internet that I have found insightful and interesting along with a short comment on the text under the link. These are all my personal comments and ideas on the respective link and are not directly related to my own professional research.

A plea for lean software (Computer magazine, 1995, DOI:10.1109/2.348001)
This is a very nice read for people who write software. Here are some quotes that convey the spirit of the article:
  • Has all this inflated software [in the 90s, compared to the much more efficient/simple code that was written in the 70s or 80s] become any faster? On the contrary. Were it not for a thousand times faster hardware, modern software would be utterly unusable.
  • The following two laws reflect the state of the art admirably well:
    • Software expands to fill memory (Parkinson).
    • Software is getting slower more rapidly than hardware becomes faster (Reiser).
  • When a system's power is measured by the number of its features, quantity becomes more important than quality.
  • Increasingly people seem to misinterpret complexity as sophistication, which is baffling -- the incomprehensible should cause suspicion rather than admiration.
  • Complexity promotes customer dependence on the vendor.
  • Customer dependence is more profitable than customer education.
  • Hardware's improved price/performance ratio has been achieved more from better technology to duplicate (fabricate) designs than from better design technique mastery. Software however is all design and its duplication costs the vendor mere pennies.
  • Instead [of iterating the design to reduce the complexity of the software], software inadequacies are typically corrected by quickly conceived additions that invariably result in the well-known bulk.
  • The time pressure that designers endure discourages careful planning. It also discourages improving acceptable solutions.
  • The tendency to adopt the "first" as the de facto standard is a deplorable phenomenon.
  • Meticulous engineering habits do not pay off in the short run, which is one reason why software plays a dubious role among established engineering disciplines.
  • When "everything goes" is the modus operandi, methodologies and disciplines are the first casualties.
  • Today, programmers are abandoning well-structured languages [like Modula-2 and Ada] and migrating to C [where the compiler doesn't do secure type checking, which can catch many contextual problems early on].
  • Abstraction can only work with languages that postulate strict, static typing of every variable and function. In this respect C fails -- it resembles assembler code, where "everything goes".
  • The most demanding aspect of system design is its decomposition into modules.
  • The belief that complex systems require armies of designers and programmers is wrong. A system that is not understood in its entirety, or at least to a sufficient degree of detail, by a single individual, should probably not be built.
  • Prolific programmers [judged by "number of lines ejected per day"] contribute to certain disaster.
  • Programs should be written for human readers as well as for computers.
Niklaus Wirth also describes the proof-of-concept solution to these problems: Oberon (the language and operating system). I reached this article through "Drowning in code: The ever-growing problem of ever-growing codebases", published in The Register in 2024, which is itself a very nice read: a reminder that this problem of 1995 still exists.
The Experiments Are Fascinating. But Nobody Can Repeat Them (New York Times, November 19th, 2018)
Andrew Gelman discusses the causes of non-reproducible research and the implications it can have. Last year, he also co-authored a similar piece for the scientific community in Nature (titled Five ways to fix statistics, also discussed below on this page). He summarizes the cause of this problem in this statement: "The big problem in science is not cheaters or opportunists, but sincere researchers who have unfortunately been trained to think that every statistically “significant” result is notable".
Magical thinking about machine learning won't bring the reality of AI any closer (The Guardian, August 5, 2018)

This article essentially introduces Ali Rahimi's talk upon receiving the "Test of time" award at the NIPS (Neural Information Processing Systems) 2017 conference. Having played an important role in this field from a decade ago, Ali Rahimi gives a very thought-provoking argument against putting too much faith in machine-learning results: they are highly sensitive to even the smallest change (as one example, he mentions a change to a rounding function that significantly altered the result), and the models operate as black boxes (which cannot be understood and involve too many random factors). He compares the current state of machine learning to alchemy, where progress was made by trial and error, without looking into the foundations of why something works.

This talk provoked a rebuttal by Yann LeCun, and some follow-up debate which you can find with some searching. Here is a summary; also see the comments on this Reddit page if you are more interested. Yann's argument is also interesting and reminds me of Thomas Kuhn's pre-paradigm phase in The structure of scientific revolutions (published in 1962). Yann's argument thus makes research in the foundations of machine learning very exciting (now that it is not yet a paradigm, and we are creating the paradigm now). However, I don't feel that Yann's argument contradicts Ali's: the way I understood it, Ali is not suggesting that we abandon machine learning; he is just advising us to be cautious and to think about the foundations as well. Given the almost religious faith in machine-learning results by a growing number of people, I feel that this is valid advice.

Once children were birched at school. Now they are taught maths (The Guardian, June 15, 2018)
A critical review of the role of examination (and in particular math exams) in education. Some nice parts of the article: "An exam is like a Dickensian birching. It asserts power, and hurts". "This boy is a victim of the overpowering cult of maths, which modern education is as obsessed with as the ancients were with Latin". "... maths has been turned into a state religion, a national ritual, and for one reason alone: because proficiency in maths is easy to measure."
How to grow a healthy lab (Nature, May 16, 2018)

A nice collection of articles in Nature that describe good practices for managing (and being a productive member of) a scientific team (lab). The illustrations below are by David Perkins, taken from the webpages of three of the articles.

Are you really the product? (Slate, Apr 27, 2018)

An interesting article in the current climate of criticizing free (as in "free drink") internet tools. It starts with a very interesting and thorough historical review of the "you're not the customer, but the product" statement, tracing it back to the 1970s! It goes on to discuss the changed circumstances (the comparison was interesting) and concludes that users of these services are actually "paying with our time, attention, and data instead of with money". It therefore argues that the "you are the product" statement is not the best criticism of the "free-ness" of these services, and may even be misleading. The points raised in this article, and the many links to interesting articles on similar topics, are thought-provoking. As someone who has used the "you are the product" statement several times in personal discussions, I cannot (currently) disagree with the points raised here and will try to find a better way to express my concern after reading this article.

But ultimately, my personal problem with this article and The problem with #DeleteFacebook (cited in this article, also published in Slate) is more fundamental than the issue of personal data and time. It is about the herd/mob mentality, which is very contagious in humans: in a group, we act/think differently. Being too close (not necessarily physically; through social media, for example) to too many people, and thus being exposed to a constant flood of stimuli, will erode the personal uniqueness of each one of us down to the common denominator of the community, herd, or bubble we have defined around us (not the single common denominator of the 1970s). Over time, this will result in less creativity and progress (like a limited gene pool, resulting in less diversity in mutations and thus weaker/less-effective evolution). Of course, connections and peer feedback/pressure are necessary aspects of a thriving modern life. It is their exaggeration into the overconnected nature of such tools, which attracts our herd mentality so deeply, that is the problem; also see The Tyranny of Convenience (discussed below).

Martin Luther King: how a rebel leader was lost to history (The Guardian, Apr 4, 2018)

A very intriguing and critical article, for the 50th anniversary of Martin Luther King's assassination. It focuses primarily on the legacy of his achievements and his contemporary critics, trying to show how badly they are interpreted today in the annual "orgy of self-congratulation, selectively misrepresenting King’s life and work". For example, a provocative (and true!) point that this article brings up can be summarized in this question: "what is the value of being able to eat in a restaurant of your choice if you can’t afford what’s on the menu?" At its core, this article is a critical view of the nature of history: "History does not objectively sift through radical leaders, pick out the best on their merits and then dedicate them faithfully to public memory. It commits itself to the task with great prejudice and fickle appreciation in a manner that tells us as much about historians and their times as the leaders themselves."

The Tyranny of Convenience (New York Times, Feb 16, 2018)

This is a very intriguing, thought-provoking and critical analysis of our desire to pursue convenience and how that affects us. The discussions brought up here resonated with me a lot, because we have exactly the same problem in science. The convenience of modern analysis tools like fast processors, the internet, and huge data collecting facilities (for example telescopes in astronomy) have also greatly industrialized and homogenized the sciences/scientists, thus removing their critical agency. This part of the conclusion describes the effect very nicely: "An unwelcome consequence of living in a world where everything is 'easy' is that the only skill that matters is the ability to multitask. At the extreme, we don't actually do anything; we only arrange what will be done". I particularly enjoyed the historical contexts discussed during the article.

The Fields Medal should return to its roots (Nature, Jan 12, 2018, in PDF)

Reviewing the history of the Fields medal in mathematics, the author (a history researcher) makes a very interesting analysis of how the selection criteria have evolved and their effect on the way the public views mathematics and also the field itself. Observing that "this idea of giving a top prize to rising stars who — by brilliance, luck and circumstance — happen to have made a major mark when relatively young is an accident of history. It is not a reflection of any special connection between mathematics and youth — a myth unsupported by the data", in another place the author notes that "In fact, Fields wrote, the committee 'should be left as free as possible' to decide winners. To minimize national rivalry, Fields stipulated that the medal should not be named after any person or place."

Five ways to fix statistics (Nature, Nov 28, 2017, in PDF)

This is a short analysis, along with some interesting solutions, by some of the most well-known names in statistics. The spirit of this paper was reminiscent of Anscombe's 1973 paper in The American Statistician, where he introduced his famous quartet to argue that "Good statistical analysis is not a purely routine matter, and generally calls for more than one pass through the computer". Almost all academic researchers are familiar with these problems but mostly choose to ignore them, prompting this group of highly recognized statisticians to speak out again after 44 years. The image below is from this article on Nature's webpage.

The introduction to this paper makes the very important observation that "We need to appreciate that data analysis is not purely computational and algorithmic — it is a human behavior". The interesting point that they raise here is that the statistical methods used regularly in papers today were defined in a data-poor paradigm/era. Therefore "Researchers who hunt hard enough will turn up a result that fits statistical criteria — but their discovery will probably be a false positive".
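That "hunt hard enough" point is easy to demonstrate with a toy simulation of my own (not from the article; the cutoff and sample sizes are arbitrary assumptions): test enough hypotheses that are pure noise, and some will clear the significance bar by chance alone.

```python
import random
import statistics

random.seed(0)

def z_score(sample, mu0=0.0):
    """Crude one-sample test statistic: distance of the sample mean
    from mu0, in standard-error units."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    return abs(mean - mu0) / se

# "Hunt" through 100 hypotheses that are all pure noise; with a
# |z| > 1.96 cutoff (roughly p < 0.05), a handful will still look
# "significant" purely by chance.
hits = sum(
    1
    for _ in range(100)
    if z_score([random.gauss(0, 1) for _ in range(50)]) > 1.96
)
print(hits, "of 100 true-null hypotheses look 'significant'")
```

On average about 5 of the 100 null hypotheses clear the cutoff: report only those, and you have published false positives without any fraud involved.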

No more remembrance days - let's consign the 20th century to history (The Guardian, Nov 9, 2017)

This is an interesting perspective on how "Remembering is easy. Forgetting is hard" (discussed in a political context in this article). History is indeed very important, and understanding how to interpret it properly (in particular, its context) is the key. In short, understanding the context/setting of a historical event is more important than the actual event. So when the contexts of an event have faded away and no longer exist, we should put it in the history books (study it academically) and not re-live it any more. Of course, this doesn't just apply to politics; it also applies to the sciences (which have their own history, one not independent of politics) and to how we interpret nature. It is harder and less clear-cut in the sciences (except in the times of revolutionaries like Galileo, or early 20th-century quantum physics). The article concludes really nicely: "It is time to remember the future".

How We Find Our Way to the Dead (New York Times, Oct 28, 2017)

I enjoyed reading this article. It is generally about how we are tempted to ascribe more meaning to nature than the raw data support. It brings up some serious and interesting examples of taking pictures of, or sending telegraphs to, ghosts in the 19th century. Of course, the same kind of problem exists in the sciences, and unfortunately (due to the very similar training of all scientists in a given field) it is much harder for us to find exactly which parts of our models don't come from the raw data and are mainly our wishful thinking. I really like how this article ends: "we can only wait for our ability to detect invented entities to catch up with our talent for creating them". As Francis Bacon (1561-1626), who defined the "scientific method", nicely says in the New Organon: "The human understanding is of its own nature prone to suppose the existence of more order and regularity in the world than it finds". In another part he nicely says: "Anticipations are a ground sufficiently firm for consent, for even if men went mad all after the same fashion, they might agree one with another well enough".

Nobody listened to Luther at first. That's why he succeeded (Washington Post, Oct 26, 2017)

As the title suggests, this article is about how the expected speed of publishing/results in the modern world is in fact very similar to a kind of censorship: "Luther's legacy as one of history's most influential thinkers shows us that there are certain epic projects that require time to mature and space to germinate before they are safe for universal exposure. Without that window, they die". This article is part of a series, for the 500th anniversary of the Protestant Reformation. Continuing, the author says: "I envy [Luther], too. Five hundred years later, there are few writers, artists, designers or intellectuals who do not feel impelled to deliver regular updates on their work online, or at weekly grad seminars, shareholder meetings or workshops with colleagues... These networks make us more professionally productive and accountable. But they also can make us more cautious, since we know that any new idea can expose us to instant censure from complete strangers in other parts of the world who know nothing of our local circumstances... it serves the interest of the orthodox and frustrates the heretic". As discussed in a few other suggested readings here, I believe the danger of this hyper-connected culture also threatens the creativity/criticality in our scientific studies of nature.

History: Science and the Reformation (Nature, Oct 26, 2017, in PDF)

This is an interesting article providing some evidence that the Reformation movement in Europe probably didn't cause the scientific revolution. Rather, the author argues that both were caused by advances in technology (geographic discoveries, the printing press, and trust in mathematics). In my view, the evidence provided does indeed sufficiently back the main argument of the article and is useful. However, the author puts too much faith in the objectivity of science (assuming that personal/religious belief doesn't affect one's scientific activities). I think this partly derives from the fact that the author isn't a scientist himself. The author is also too confident in the results of his argument and categorically affirms/rejects points (of the many examples, here is one: "The link between the Reformation and the scientific revolution is not one of causation."). A true scientist would never make such categorical claims (being open to possible future evidence) and would rather end the statement above with "... does not appear to be one of causation". The "American exceptionalism" view of the author is also clear in this statement: "there was no word for discovery in European languages before exploration uncovered the Americas". It ignores the discovery of ocean routes to the east, for example by Vasco da Gama (who found the ocean route to India in the same decade as Columbus). These discoveries to the east were arguably much more economically and culturally influential on late medieval Europe than those to the west.

John Kelly Suggests More Americans Should Have the Honor of Serving. He's Right. (New York Times, Oct 24, 2017)

This was an interesting and short article in the New York Times on an important topic: conscription. It raises a point that I also agree with but many people don't notice: a country's military should not be separate from its people. In Iran (where I come from), conscription is mandatory and is a major issue for male teenagers and young adults. Since most advanced countries don't have conscription, people there often associate it with an old way of thinking. But as this article points out (in the "Editorial notebook" section of the New York Times' opinion section): "Requiring everyone to serve in some fashion ... would be a profoundly democratizing action". This issue was first brought up for me in the documentary The fog of war, where Robert McNamara (secretary of defense during the Vietnam war) mentioned how support for ending the Vietnam war increased as more and more people were conscripted. My current problem with conscription in Iran is that it is only for men. I believe it should be for the whole society (the military isn't all about front lines and extremely hard physical conditions; there are many roles that women can take charge of). Conscription allows all people in society to appreciate peace (and oppose war) and to be with people outside their social class/bubble. Finally, it makes people physically/mentally stronger. Our peace is very fragile, and the best way to stop populists/nationalists from easily making threats of war is for everyone to have a tangible (close family member) stake in it.

Science and Facebook: the same popularity law! (arXiv 1701.05347, Jan 19, 2017)

By comparing Facebook "share"s with citations in scientific journals, these authors conclude that "... the distribution of shares for the Facebook posts re-scale in the same manner to the very same curve with scientific citations. This finding suggests that citations are subjected to the same growth mechanism with Facebook popularity measures, being influenced by a statistically similar social environment and selection mechanism.". The effect of the social environment on the body of scientific knowledge was already well known; for example, see Harwit (2009). But we like to think that as scientists we are more careful and behave more intelligently/critically in the process of creating a more objective/scientific view of nature. If we accept the methods/results of this paper (and others like it that are referenced in it), and hope that we can learn something about nature (and not merely expand our existing dogmas), then the integrity of this method of rewarding scientific achievements (by citations) is greatly compromised.

Lack of quality discrimination in online information markets (arXiv 1701.02694, Jan 10, 2017)

The main focus of this paper is on social media, but it can be applied equally well to the scientific literature and the behavior of the scientific community. For those scientific fields that arXiv supports, it has indeed been very useful in distributing the most recent papers, and the number of papers arXiv publishes is constantly increasing. However, the downside is that every scientist has only a limited amount of attention to invest in each paper (or even to simply go over the titles/abstracts relevant to their work every day). Hence, the torrent of papers that comes in every day, and the information overload it produces, is not very dissimilar to the torrent of social-media information the general public receives. Thus the results of this paper might also be applicable to the progress of science. From the introduction: "This body of work suggests that, paradoxically, our behavioral mechanisms to cope with information overload may make online information markets less meritocratic and diverse".

Publication bias and the canonization of false facts (arXiv 1609.00494, Sep 2, 2016)

The definition of publication bias presented here (citing Sterling 1959) was very interesting for me: publication bias arises when the probability that a scientific study is published is not independent of its result. The authors go on to create a simple-to-understand model of how a claim (or hypothesis) comes to be tested and finally published. They then look at how previously published results affect subsequent ones as a Markov process: each successive published result shifts the community's degree of belief, until sufficient evidence accumulates to accept the claim as fact or to reject it as false. The fact that most (nearly 80% of) published results are "positive" is then fed into the model to show how false claims can easily be given "fact" status in this system. It was a very thought-provoking and interesting article for me to read.
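The belief-updating process described above can be sketched as a small simulation. This is my own minimal toy version, not the authors' actual model, and every parameter value below is an arbitrary assumption chosen only for illustration:

```python
import random

def simulate_claim(claim_is_true, n_experiments=200,
                   p_pos_if_true=0.8,    # power of a single experiment
                   p_pos_if_false=0.05,  # false-positive rate (alpha)
                   p_publish_pos=0.9,    # publication bias: positives are
                   p_publish_neg=0.2,    # published far more often
                   log_odds_step=1.0, threshold=5.0):
    """Follow the community's belief in a claim as a random walk in
    log-odds, driven only by *published* results (a gross simplification
    of the model discussed above)."""
    belief = 0.0  # log-odds; 0 means 50/50
    for _ in range(n_experiments):
        p_pos = p_pos_if_true if claim_is_true else p_pos_if_false
        positive = random.random() < p_pos
        p_publish = p_publish_pos if positive else p_publish_neg
        if random.random() >= p_publish:
            continue  # the result sits in the file drawer, unseen
        belief += log_odds_step if positive else -log_odds_step
        if belief >= threshold:
            return "fact"      # canonized
        if belief <= -threshold:
            return "rejected"  # discarded as false
    return "undecided"

random.seed(1)
false_claims = [simulate_claim(False) for _ in range(1000)]
print("false claims canonized as fact:",
      false_claims.count("fact") / 1000)
```

Lowering `p_publish_neg` (i.e. making negative results even less publishable) pushes a growing fraction of false claims across the "fact" threshold, which is the qualitative effect the paper describes.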

Do You Believe in God, or Is That a Software Glitch? (New York Times, Aug 27, 2016)

This is a nice example of how the pressure to publish more and faster has pushed scientists to simply use data and software without really understanding what the software does or how its results should be interpreted. It is a review of Eklund et al. (2016), where the authors show that the false-positive rate of the most common fMRI analysis tools can reach 70% (while it should be 5%). They are critical of the fact that "surprisingly its most common statistical methods have not been validated using real data". This New York Times review also nicely discusses how data and software (both the software itself and its configuration files) are increasingly not shared, and how this undermines the core principle of science: reproducibility.

As a member of the astronomical community, I can say this problem is also strongly present in my field (observational extra-galactic astrophysics). Most researchers guard their software techniques like trade secrets! I have tried to tackle this wrong attitude with a fully reproducible approach to my own research. Please see the reproduction pipeline for my recent paper in the Astrophysical Journal Supplement Series as an example.

The age of post-truth politics (New York Times, Aug 24, 2016)

This article discusses how the torrent of raw data, and easy access to it, has changed our perspective on "facts". While it is primarily focused on politics, I believe the underlying problem it raises (we have more data and analysis methods than we can handle) is also present in scientific communities.

The natural selection of bad science (arXiv 1605.09511, May 31, 2016)

The core argument of this article is that "The persistence of poor methods results partly from incentives that favor them, leading to the natural selection of bad science." The authors provide a model for studying this selection process and how it leads to bad science. Andrew Gelman's review of the paper is also a nice related read, followed by interesting comments.

The pressure to publish pushes down quality (Nature, May 11, 2016, in PDF)

The title says it all. The argument and examples given by Daniel Sarewitz in this article are very interesting. The case of contaminated cancer studies, and how many published results have been affected by it, was particularly striking.

A good quote from that paper: "So yes, the web makes it much more efficient to identify relevant published studies, but it also makes it that much easier to troll for supporting papers, whether or not they are any good. No wonder citation rates are going up."

Scientific notations for the digital era (arXiv 1605.02960, May 10, 2016)

Konrad Hinsen makes a very intriguing argument and proposal concerning the problem of "representation of scientific knowledge". The paper is very thought-provoking and makes some really interesting arguments against blindly using the traditional methods of conveying ever-more complex scientific knowledge.

The growth of astrophysical understanding (IAU Proceedings, Jan 2009)

This is a very interesting article by Martin Harwit which discusses how astronomy and astrophysics are progressing in the modern era. Here is one quote from the text: "Instead of competing against colleagues, you join them, often in the name of accomplishing more through joint rather than separate efforts. The joining of forces also translates into the increasing number of authors appearing on articles that, in earlier decades, might have been written by just one individual, or perhaps by a student and his professor. Hence the increasingly dense links between authors, and the emergence of the giant component of astronomers who publish jointly and think alike".

My Beef With Big Media (Washington Monthly, July/August 2004)

Ted Turner (founder of CNN, which was later bought by Time Warner) wrote this in 2004 about how media corporations became so huge and unified over the preceding decades, and the effect this has had on removing different/new points of view. It is a very interesting read and I highly recommend it. Like the media, many other sectors have also become highly centralized over the last few decades, including scientific publishing. Hence, they can suffer the same effects that Turner discusses in this article. Even though we have arXiv, the journal an article is published in still plays a critical role in whether the community reads it.