


3. Several Valuable Suggestions

3.1 Postulates and Principles

 

Complex ideas may perhaps be well known by definition, which is nothing but an enumeration of those parts or simple ideas that comprise them. But when we have pushed up definitions to the most simple ideas, and find still some ambiguity and obscurity, what resources are we then possessed of?

                                                                                                                David Hume, 1748

 

As discussed in Section 1, even after stipulating the existence of coordinate systems with respect to which inertia is homogeneous and isotropic, there remains a fundamental ambiguity as to the character of the relationship between relatively moving inertial coordinate systems, corresponding to three classes of possible metrical structures, with the k values −1, 0, and +1. There is a remarkably close historical analogy for this situation, dating back to one of the first formal systems of thought ever proposed. In Book I of The Elements, Euclid consolidated and systematized plane geometry as it was known circa 300 BC into a formal deductive system. As it has come down to us, it is based on five postulates together with several definitions and common notions. (It’s worth noting, however, that the classification of these premises was revised many times in various translations.) The first four of these postulates are stated very succinctly:

 

1.  A straight line may be drawn from any point to any other point.

2.  A straight line segment can be uniquely and indefinitely extended.

3.  We may draw a circle of any radius about any point.

4.  All right angles are equal to one another.

 

Each of these assertions actually entails a fairly complicated set of premises and ambiguities, but they were accepted as unobjectionable for two thousand years. However, Euclid's final postulate was regarded with suspicion from earliest times. It has a very different appearance from the others - a difference that neither Euclid nor his subsequent editors and translators attempted to disguise. The fifth postulate is expressed as follows:

 

5.   If a straight line falling on two straight lines makes the [sum of the] interior angles on the same side less than two right angles, then the two straight lines, if produced indefinitely, meet on that side on which the angles are less than two right angles.

 

This postulate is equivalent to the statement that there's exactly one line through a given point P parallel to a given line L, as illustrated below:

 

[pic]

 

Although this proposition is fairly plausible (albeit somewhat awkward to state), many people suspected that it might be logically deducible from the other postulates, axioms, and common notions. There were also attempts to substitute for Euclid's fifth postulate a simpler or more self-evident proposition. However, we now understand that Euclid's fifth postulate is logically independent of the rest of Euclid's logical structure. In fact, it's possible to develop logically consistent geometries in which Euclid's fifth postulate is false. For example, we can assume that there are infinitely many lines through P that are parallel to (i.e., never intersect) the line L. It might seem (at first) that it would be impossible to reason with such an assumption, that it would either lead to contradictions or else cause the system to degenerate into a logical triviality about which nothing interesting could be said, but, remarkably, this turns out not to be the case.

 

Suppose that although there are infinitely many lines through P that never intersect L, there are also infinitely many that do intersect L. This, combined with the other axioms and postulates of plane geometry, implies that there are two lines through P defining the boundary between lines that do intersect L and lines that don't, as shown below:

 

[pic]

 

This leads to the original non-Euclidean geometry of Lobachevski, Bolyai, and Gauss, i.e., the hyperbolic plane. The analogy to Minkowski spacetime is obvious. The behavior of “straight lines” in a surface of negative curvature (whose metric is nevertheless positive-definite) is nicely suggestive of how the light-lines in spacetime serve as the dividing lines between those lines through P that intersect with the future "L" and those that don't (distinguishing between spacelike and timelike intervals). This is also a nice illustration of the fact that even though Minkowski spacetime is "flat" in the Riemannian sense, it is nevertheless distinctly non-Euclidean. Of course, the possibility that spacetime might be curved as well as locally Minkowskian led to general relativity, but arguably the conceptual leap required to go from a positive-definite to a non-positive-definite metric is greater than that required to go from a flat to a curved metric. The former implies that the local geometrical structure of the effective spatio-temporal manifold of events is profoundly different than had been assumed for thousands of years, and this realization led naturally to a new set of principles with which to organize and interpret our experience.

 

It became clear in the 19th century that there are actually three classes of geometries consistent with Euclid’s basic premises, depending on what we adopt as the “fifth postulate”. The three types of geometry correspond to spaces of negative, positive, or zero curvature. The analogy to the three possible classes of spacetimes (Euclidean, Galilean, and Minkowskian) is obvious, and in both cases it came to be recognized that, insofar as these mathematical structures were supposed to represent physical properties, the choice between the alternatives was a matter for empirical investigation.
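
For reference, this three-fold alternative can be stated compactly (in notation of my own choosing, which may differ in sign convention from that used in Section 1) in terms of the quadratic quantity left unchanged by the transformation between two relatively moving systems of inertial coordinates x,t and x',t':

t'^2 + k x'^2  =  t^2 + k x^2

With k = +1 this is just an ordinary Euclidean rotation of the x,t plane, with k = 0 it degenerates to the Galilean condition that the time coordinate itself is invariant, and with k = −1 (in units with c = 1) it gives the Minkowskian invariant t^2 − x^2.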

 

Nevertheless, the superficially axiomatic way in which Einstein presented the special theory in his 1905 paper tended to encourage the idea that special relativity represented a closed formal system, like Euclid’s geometry interpreted in the purely mathematical sense. For example, in 1907 Paul Ehrenfest wrote that

 

In the formulation in which Mr Einstein published it, Lorentzian relativistic electrodynamics is rather generally viewed as a complete system. Accordingly it must be able to provide an answer purely deductively to the question [involving the shape of the moving electron]…

 

However, Einstein himself was quick to disavow this idea, answering

 

The principle of relativity, or, more exactly, the principle of relativity together with the principle of the constancy of the velocity of light, is not to be conceived as a “complete system,” in fact, not as a system at all, but merely as a heuristic principle which, when considered by itself, contains only statements about rigid bodies, clocks, and light signals. It is only by requiring relations between otherwise seemingly unrelated laws that the theory of relativity provides additional statements.

 

Just as the basic premises of Euclid’s geometry were classified in many different ways (e.g., postulates, axioms, common notions, definitions), the premises on which Einstein based special relativity can be classified in many different ways. Indeed, in his 1905 paper, Einstein introduced the first of these premises as follows:

 

... the same laws of electrodynamics and optics will be valid for all coordinate systems in which the equations of mechanics hold good. We will raise this conjecture (hereafter called the "principle of relativity") to the status of a postulate...

 

Here, in a single sentence, we find a proposition referred to as a conjecture, a principle, and a postulate. The meanings of these three terms are quite distinct, but they are each arguably applicable. The assertion of the co-relativity of optics and mechanics was, and will always be, conjectural, because it can be empirically corroborated only up to a limited precision. Einstein formally adopted this conjecture as a postulate, but on a more fundamental level it serves as a principle, since it entails the decision to organize our knowledge in terms of coordinate systems with respect to which the equations of mechanics hold good, i.e., inertial coordinate systems. Einstein goes on to introduce a second proposition that he formally adopts as a postulate, namely,

 

... that light always propagates in empty space with a definite velocity c that is independent of the state of motion of the emitting body. These two postulates suffice for the attainment of a simple and consistent electrodynamics of moving bodies based on Maxwell's theory for bodies at rest.

 

Interestingly, in the paper "Does the Inertia of a Body Depend on Its Energy Content?" published later in the same year, Einstein commented that

 

... the principle of the constancy of the velocity of light... is of course contained in Maxwell's equations.

 

In view of this, some have wondered why he did not simply dispense with his "second postulate” and assert that the "laws of electrodynamics and optics" in the statement of the first principle are none other than Maxwell's equations. In other words, why didn’t he simply base his theory on the single proposition that Maxwell's equations are valid for every system of coordinates in which the laws of mechanics hold good? Part of the answer is that he realized important parts of physics, such as the physics of elementary particles, cannot possibly be explained in terms of Maxwellian electrodynamics. In a note published in 1907 he wrote

 

It should be noted that the laws that govern [the structure of the electron] cannot be derived from electrodynamics alone. After all, this structure necessarily results from the introduction of forces which balance the electrodynamic ones.

 

More fundamentally, by 1905 he was already aware of the fact that, although Maxwell's equations are empirically satisfactory in many respects, they cannot be regarded as fundamentally correct or valid. In his paper "On a Heuristic Point of View Concerning the Production and Transformation of Light" he wrote

 

... despite the complete confirmation of [Maxwell's theory] by experiment, the theory of light, operating with continuous spatial functions, leads to contradictions when applied to the phenomena of emission and transformation of light.

 

Thus it isn't surprising that he chose not to base the theory of relativity on Maxwell’s equations. He needed to distill from electromagnetic phenomena the key feature whose significance "transcended its connection with Maxwell's equations", and which would serve as a viable principle for organizing our knowledge of all phenomena, including both optics and mechanics. The principle he selected was the existence of an invariant speed with respect to any local system of inertial coordinates, and then for definiteness he could identify this speed with the speed of propagation of electromagnetic energy.

 

After reviewing the operational definition of inertial coordinates in §1 (which he does by optical rather than mechanical means, thereby missing an opportunity to clarify the significance of inertial coordinates in establishing the connection between mechanical and optical phenomena), he gives more formal statements of his two principles:

 

The following reflections are based on the principle of relativity and the principle of the constancy of the velocity of light. These two principles we define as follows:

 

1.  The laws by which the states of physical systems undergo change are not affected, whether these changes of state be referred to the one or the other of two systems of co-ordinates in uniform translatory motion.

2.  Any ray of light moves in the "stationary" system of co-ordinates with the determined velocity c, whether the ray is emitted by a stationary or by a moving body. Hence velocity equals [length of] light path divided by time interval [of light path], where time interval [and length are] to be taken in the sense of the definition in §1.

 

The first of these is nothing but the principle of inertial relativity, which had been accepted as a fundamental principle of physics since the time of Galileo (see section 1.3). Strictly speaking, Einstein’s statement of the principle here is incorrect, because he assumes the coordinate systems in which the equations of mechanics hold good are fully characterized by being in uniform translatory motion, whereas in fact it is also necessary to specify an inertially isotropic simultaneity. Einstein chose to address this aspect of inertial coordinate systems by means of a separate and seemingly arbitrary definition of simultaneity based on optical phenomena, which unfortunately has invited much misguided philosophical debate about what should be considered “true” simultaneity. All this could have been avoided if, from the start, Einstein had merely stated that an inertial coordinate system is one in which mechanical inertia is homogeneous and isotropic (just as Galileo said), and then noted that this automatically entails the conventional choice of simultaneity. The content of his first principle (i.e., the relativity principle) is simply that the inertial simultaneity of mechanics and the optical simultaneity of electrodynamics are identical.

 

Despite the shortcomings of its statement, the principle of relativity was very familiar to the physicists of 1905, whether they wholeheartedly accepted it or not. Einstein's second principle, by itself, was also not regarded as particularly novel, because it conveys the usual understanding of how a wave propagates at a fixed speed through a medium, independent of the speed of the source. It was the combination of these two principles that was new, since they had previously been thought to be irreconcilable. In a sense, the first principle arose from the “ballistic particles in a vacuum” view of physics, and the second arose from the “wave in a material medium” view of physics. Both of these views can trace their origins back to ancient times, and both seem to capture some fundamental truth about the world, and yet they had always been regarded as mutually exclusive. Einstein’s achievement was to explain how they could be reconciled.

 

Of course, Einstein’s second principle isn't a self-contained statement, because its entire meaning and significance depends on "the sense of" time intervals and (implicitly) spatial lengths given in §1, where we find that time intervals and spatial lengths are defined to be such that their ratio equals the fixed constant c for light paths. This has tempted some readers to conclude that "Einstein's second principle" was merely a tautology, with no substantial content. The source of this confusion is the fact that the essential axiomatic foundations underlying special relativity are contained not in the two famous propositions at the beginning of §2 of Einstein's paper (as quoted above), but rather in the sequence of assumptions and definitions explicitly spelled out in §1. Among these is the very first statement

 

Let us take a system of co-ordinates in which the equations of Newtonian mechanics hold good.

 

In subsequent reprints of this paper Sommerfeld added a footnote to this statement, to say "i.e., to the first approximation", meaning for motion with speeds small in comparison with the speed of light. (This illustrates the difficulty of writing a paper that results in a modification of the equations of Newtonian mechanics!) Of course, Einstein was aware of the epistemological shortcomings of the above statement, because while it tells us to begin with an inertial system of coordinates, it doesn't tell us how to identify such a system. This has always been a potential source of ambiguity for mechanics based on the principle of inertia. Strictly speaking, Newton's laws are epistemologically circular, so in practice we must apply them both inductively and deductively. First we use them inductively with our primitive observations to identify inertial coordinate systems by observing how things behave. Then at some point when we've gained confidence in the inertialness of our coordinates, we begin to apply the laws deductively, i.e., we begin to deduce how things will behave with respect to our inertial coordinates. Ultimately this is how all physical theories are applied, first inductively as an organizing principle for our observations, and then deductively as "laws" to make predictions. Neither Galilean nor special relativity is able to justify the privileged role given to a particular class of coordinate systems, nor to provide a non-circular means of identifying those systems. In practice we identify inertial systems by means of an incomplete induction. Although Einstein was aware of the deficiency of this approach (which he subsequently labored to eliminate from the general theory), in 1905 he judged it to be the only pragmatic way forward.

 

The next fundamental assertion in §1 of Einstein's paper is that lengths and time intervals can be measured by (and expressed in terms of) a set of primitive elements called "measuring rods" and "clocks". As discussed in Section 1.2, Einstein was fully aware of the weakness in this approach, noting that “strictly speaking, measuring rods and clocks should emerge as solutions of the basic equations”, not as primitive conceptions. Nevertheless

 

it was better to admit such inconsistency - with the obligation, however, of eliminating it at a later stage of the theory...

 

Thus the introduction of clocks and rulers as primitive entities was another pragmatic concession, and one that Einstein realized was not strictly justifiable on any other grounds than provisional expediency.

 

Next Einstein acknowledges that we could content ourselves with timing events by means of an observer located at the origin of the coordinate system, which corresponds to the absolute time of Lorentz, as discussed in Section 1.6. Following this he describes the "much more practical arrangement" based on the reciprocal operational definition of simultaneity. He says

 

We assume this definition of synchronization to be free of any possible contradictions, applicable to arbitrarily many points, and that the following relations are universally valid:

1. If the clock at B synchronizes with the clock at A, the clock at A synchronizes with the clock at B.

2.  If the clock at A synchronizes with the clock at B and also with the clock at C, the clocks at B and C also synchronize with each other.

 

These are important and non-trivial assumptions about the viability of the proposed operational procedure for synchronizing clocks, but they are only indirectly invoked by the reference to "the sense of time intervals" in the statement of Einstein's second principle. Furthermore, as mentioned in Section 1.6, Einstein himself subsequently identified at least three more assumptions (homogeneity, spatial isotropy, memorylessness) that are tacitly invoked in the formal development of special relativity. The list of unstated assumptions would actually be even longer if we were to construct a theory beginning from nothing but an individual's primitive sense perceptions. The justification for leaving them out of a scientific paper is that these can mostly be classified as what Euclid called "common notions", i.e., axioms that are common to all fields of thought.
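
As an aside, the two synchronization relations quoted above can be illustrated with a small computational sketch (entirely my own, not part of Einstein's paper). In a model where light travels at the same speed in both directions with respect to the chosen coordinates, and each clock simply reads coordinate time plus a fixed offset, Einstein's criterion t_B − t_A = t'_A − t_B holds precisely when the two offsets are equal, and the symmetry and transitivity then follow at once:

    # Hypothetical model (illustration only): a clock is a fixed position plus a
    # fixed offset from coordinate time; light moves at speed c = 1 in both directions.
    c = 1.0

    def synchronized(x_a, off_a, x_b, off_b, t_emit=0.0, tol=1e-12):
        """Einstein's criterion t_B - t_A == t'_A - t_B for a signal A -> B -> A."""
        trip = abs(x_b - x_a) / c
        t_a  = t_emit + off_a               # reading at A when the signal leaves
        t_b  = t_emit + trip + off_b        # reading at B when it is reflected
        t_a2 = t_emit + 2*trip + off_a      # reading at A when it returns
        return abs((t_b - t_a) - (t_a2 - t_b)) < tol

    A, B, C = (0.0, 0.2), (5.0, 0.2), (12.0, 0.2)     # (position, offset) pairs
    print(synchronized(*A, *B), synchronized(*B, *A))                        # True True (relation 1)
    print(synchronized(*A, *B), synchronized(*A, *C), synchronized(*B, *C))  # True True True (relation 2)
    print(synchronized(*A, *(7.0, 0.5)))              # False: a clock with a different offset fails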

 

In many respects Einstein modeled his presentation of special relativity not on Euclid’s Elements (as Newton had done in the Principia), but on the formal theory of thermodynamics, which is founded on the principle of the conservation of energy. There are different kinds of energy, with formally different units, e.g., mechanical and gravitational potential energy are typically measured in terms of joules (a force times a distance, or equivalently a mass times a squared velocity), whereas heat energy is measured in calories (the amount of heat required to raise the temperature of 1 gram of water by one degree C). It's far from obvious that these two things can be treated as different aspects of the same thing, i.e., energy. However, through careful experiments and observations we find that whenever mechanical energy is dissipated by friction (or any other dissipative process), the amount of heat produced is proportional to the amount of mechanical energy dissipated. Conversely, whenever heat is involved in a process that yields mechanical work, the heat content is reduced in proportion to the amount of work produced. In both cases the constant of proportionality is found to be 4.1833 joules per calorie.

 

Now, the First Law of thermodynamics asserts that the total energy of any physical process is always conserved, provided we "correctly" account for everything. Of course, in order for this assertion to even make sense we need to define the proportionality constants between different kinds of energy, and those constants are naturally defined so as to make the First Law true. In other words, we determine the proportionality between heat and mechanical work by observing the changes in these quantities and assuming that the two changes represent equal quantities of something called "energy". But this assumption is essentially equivalent to the First Law, so if we apply these operational definitions and constants of proportionality, the conservation of energy can be regarded as a tautology or a convention.
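
The bookkeeping can be made concrete with a trivial numerical sketch (my own illustration, using invented numbers and the rounded constant 4.18 joules per calorie); the final check succeeds by construction, which is exactly the sense in which the conservation statement functions as a convention:

    J_PER_CAL = 4.18                  # mechanical equivalent of heat (rounded)

    work_dissipated_J = 2090.0        # mechanical energy lost to friction, in joules
    heat_produced_cal = work_dissipated_J / J_PER_CAL
    print(heat_produced_cal)          # about 500 calories of heat appear

    # Expressed in a common unit, the total "energy" is unchanged, by construction.
    assert abs(work_dissipated_J - heat_produced_cal * J_PER_CAL) < 1e-9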

 

This shows clearly that, just as in the case of Newton's laws, these propositions are actually principles rather than postulates, meaning that they first serve as organizing principles for our measurements and observations, and only subsequently do they serve as "laws" from which we may deduce further consequences. This is the sense in which fundamental physical principles always operate. Wien's letter of 1912 nominating Einstein and Lorentz for the Nobel prize commented on this same point, saying that "the confirmation of [special relativity] by experiment... resembles the experimental confirmation of the conservation of energy".

 

Einstein himself acknowledged that he consciously modeled the formal structure of special relativity on thermodynamics. He wrote in his autobiographical notes

 

Gradually I despaired of the possibility of discovering the true laws by means of constructive efforts based on known facts. The longer and the more desperately I tried, the more I came to the conviction that only the discovery of a universal formal principle could lead us to assured results. The example I saw before me was thermodynamics. The general principle was there given in the proposition: The laws of nature are such that it is impossible to construct a perpetuum mobile (of the first and second kinds).

 

This principle is a meta-law, i.e., it does not express a particular law of nature, but rather a general principle to which all the laws of nature conform. In 1907 Ehrenfest suggested that special relativity constituted a closed axiomatic system, but Einstein quickly replied that this was not the case. He explained that the relativity principle combined with the principle of invariant light speed is not a closed system at all, but rather it provides a coherent framework within which to conduct physical investigations. As he put it, the principles of special relativity "permit certain laws to be traced back to one another (like the second law of thermodynamics)."

 

Not only is there a close formal similarity between the axiomatic structures of thermodynamics and special relativity, each based on two fundamental principles, but the two theories are also substantive extensions of each other. The first law of thermodynamics can be placed in correspondence with the basic principle of relativity, which suggests the famous relation E = mc2, thereby enlarging the realm of applicability of the first law. The second law of thermodynamics, like Einstein's second principle of invariant light speed, is more sophisticated and more subtle. A physical process whose net effect is to remove heat from a body and produce an equivalent amount of work is called perpetual motion of the second kind. It isn't obvious from the first law that such a process is impossible, and indeed there were many attempts to find such a process - just as there were attempts to identify the rest frame of the electromagnetic ether - but all such attempts failed. Moreover, they failed in such a way as to make it clear that the failures were not accidental, but that a fundamental principle was involved.

 

In the case of thermodynamics this was ultimately formulated as the second law, one statement of which (as alluded to by Einstein in the quote above) is simply that perpetual motion of the second kind is impossible - provided the various kinds of energy are defined and measured in the prescribed way. (This theory was Einstein's bread and butter, not only because most of his scientific work prior to 1905 had been in the field of thermodynamics, but also because a patent examiner inevitably is called upon to apply the first and second laws to the analysis of hopeful patent applications.) Compare this with Einstein's second principle, which essentially asserts that it's impossible to measure a speed in excess of the constant c - provided the space and time intervals are defined and measured in the prescribed way. The strength of both principles is due ultimately to the consistency and coherence of the ways in which they propose to analyze the processes of nature.

 

Needless to say, our physical principles are not arbitrarily selected assumptions; they are hard-won distillations of a wide range of empirical facts. Regarding the justification for the principles on which Einstein based special relativity, many popular accounts give a prominent place to the famous experiments of Michelson and Morley, especially the crucial version performed in 1887, often presenting this as the "brute fact" that precipitated relativity. Why, then, does Einstein’s 1905 paper fail to cite this famous experiment? It does mention at one point “the various unsuccessful attempts to measure the Earth’s motion with respect to the ether”, but never refers to Michelson's results specifically. The conspicuous absence of any reference to this important experimental result has puzzled biographers and historians of science. Clearly Einstein’s intent was to present the most persuasive possible case for the relativity of space and time, and Michelson's results would (it seems) have been a very strong piece of evidence in his favor. Could he simply have been unaware of the experiment at the time of writing the paper?

 

Einstein’s own recollections on this point were not entirely consistent. He sometimes said he couldn’t remember if he had been aware in 1905 of Michelson's experiments, but at other times he acknowledged that he had known of them from having read the works of Lorentz. Indeed, considering Einstein’s obvious familiarity with Lorentz’s works, and given all the attention that Lorentz paid to Michelson’s ether drift experiments over the years, it’s difficult to imagine that Einstein never absorbed any reference to those experiments. Assuming he was aware of Michelson's results prior to 1905, why did he choose not to cite them in support of his second principle? Of course, his paper includes no formal “references” at all (which in itself seems peculiar, especially to modern readers accustomed to extensive citations in scholarly works), but it does refer to some other experiments and theories by name, so an explicit reference to Michelson’s result would not have been out of place.

 

One possible explanation for Einstein’s reluctance to cite Michelson, both in 1905 and subsequently, is that he was sophisticated enough to know that his “theory” was technically just a re-interpretation of Lorentz’s theory - making identical predictions - so it could not be preferred on the basis of agreement with experiment. To Einstein the most important quality of his interpretation was not its consistency with experiment, but its inherent philosophical soundness. In other words, conflict with experiment was bad, but agreement with experiment by means of ad hoc assumptions was hardly any better. His critique of Lorentz’s theory (or what he knew of it at the time) was not so much that it was empirically "wrong" (which it wasn’t), but that the length contraction and time dilation effects had been inserted ad hoc to match Michelson’s null results. (It’s debatable whether this critique was justified, in view of the discussion in Section 1.5.) Therefore, Einstein would naturally have been concerned to avoid giving the impression that his relativistic theory had been contrived specifically to conform with Michelson’s results. He may well have realized that any appeal to the Michelson-Morley experiment in order to justify his theory would diminish rather than enhance its persuasiveness.

 

This is not to suggest that Einstein was being disingenuous, because it’s clear that the principles of special relativity actually do emerge very naturally from just the first-order effects of magnetic induction (for example), and even from more basic considerations of the mathematical intelligibility of Galilean versus Lorentzian transformations (as stressed by Minkowski in his famous 1908 lecture). It seems clear that Einstein’s explanations for how he arrived at special relativity were sincere expressions of his beliefs about how the theory took shape in his own mind. He was focused on the phenomenon of magnetic induction and the unphysical asymmetry of the pre-relativistic explanations. This was combined with a strong instinctive belief in the complete relativity of physics. He told Shankland in 1950 that the experimental results which had influenced him the most were stellar aberration and Fizeau's measurements on the speed of light in moving water. "They were enough," he said.

 

3.2  Natural and Violent Motions

 

Mr Spenser in the course of his remarks regretted that so many members of the Section were in the habit of employing the word Force in a sense too limited and definite to be of any use in a complete theory.  He had himself always been careful to preserve that largeness of meaning which was too often lost sight of in elementary works.  This was best done by using the word sometimes in one sense and sometimes in another, and in this way he trusted that he had made the word occupy a sufficiently large field of thought.

                                                                                                       James Clerk Maxwell

 

The concept of force is one of the most peculiar in all of physics. It is, in one sense, the most viscerally immediate concept in classical mechanics, and seems to serve as the essential "agent of causality" in all interactions, and yet the ontological status of force has always been highly suspect. We sometimes regard force as the cause of changes in motion, and imagine that those changes would not occur in the absence of the forces, but this causative aspect of force is an independent assumption that does not follow from any quantifiable definition, since we could equally well regard force as being caused by changes in motion, or even as merely a descriptive parameter with no independent ontological standing at all.

 

In addition, there is an inherent ambiguity in the idea of changes in motion, because it isn't obvious what constitutes unchanging (i.e., unforced) motion. Aristotle believed it was necessary to distinguish between two fundamentally distinct kinds of motion, which he called natural motions and violent motions. The natural motions included the apparent movements of celestial objects, the falling of leaves to the ground, the upward movement of flames and hot gases in the atmosphere, or of air bubbles in water, and so on. According to Aristotle, the cause of such motions is that all objects and substances have a natural place or level (such as air above, water below), and they proceed in the most direct way, along straight vertical paths, to their natural places. The motion of the celestial bodies is circular because this is the most perfect kind of unchanging eternal motion, whereas the necessarily transitory motions of sublunary objects are rectilinear. It may not be too misleading to characterize Aristotle's concept of sublunary motion as a theory of buoyancy, since the natural place of light elements is above, and the natural place of heavy elements is below. If an object is out of place, it naturally moves up or down as appropriate to reach its proper place.

 

Aristotle has often been criticized for saying (or seeming to say) that the speed at which an object falls (through the air) is proportional to its weight. To the modern reader this seems absurd, as it is contradicted by the simplest observations of falling objects. However, it's conceivable that we misinterpret Aristotle's meaning, partly because we're so accustomed to regarding the concept of force as the cause of motion, rather than as an effect or concomitant attribute of motion. If we consider the downward force (which Aristotle would call the weight) of an object to be the force that would be required to keep it at its current height, then the "weight" of an object really is substantially greater the faster it falls. More strength is required to catch a falling object than to hold the same object at rest. Some Aristotelian scholars have speculated that this was Aristotle's actual meaning, although his writings on the subject are so sketchy that we can't know for certain. In any case, it illustrates that the concept and significance of force in a physical theory is often murky, and it also shows how thoroughly our understanding of physical phenomena is shaped by the distinction between forces (such as gravity) that we consider to be causes of motion, and those (such as impact forces) that we consider to be caused by motion.

 

Aristotle also held that the speed of motion was not only proportional to the "weight" (whatever that means) but inversely proportional to the resistance of the medium. Thus his proposed law of motion could be expressed roughly as V = W/R, and he used this to argue against the possibility of empty space, i.e., regions in which R = 0, because the velocity of any object in such a region would be infinite. This doesn't seem like a very compelling argument, since we could easily counter that the putative vacuum would not be the natural place of any object, so it would have no "weight" in that direction either. Nevertheless, perhaps to avoid wrestling with the mysterious fraction 0/0, Aristotle surrounded the four sublunary elements of Earth, Water, Air, and Fire with a fifth element (quintessence), the lightest of all, called aether. This aether filled the super-lunary region, ensuring that we would never need to divide by zero.

 

In addition to natural motions, Aristotle also considered violent motions, which were any motions resulting from acts of volition of living beings. Although his writings are somewhat obscure and inconsistent in this area, it seems that he believed such beings were capable of self-motion, i.e., of initiating motion in the first instance, without having been compelled to motion by some external agent. Such self-movers are capable of inducing composite motions in other objects, such as when we skip a stone on the surface of a pond. The stone's motion is compounded of a violent component imparted by our hand, and the natural component of motion compelling it toward its natural place (below the air and water). However, as always, we must be careful not to assume that this motion is to be interpreted as the causative result of the composition of two different kinds of forces. It was, for Aristotle, simply the kinematic composition of two different kinds of motion.

 

The bifurcation of motion into two fundamentally different types, one for natural motions of non-living objects and another for acts of human volition – and the attention that Aristotle gave to the question of unmoved movers, etc. – is obviously related to the issue of free will, and demonstrates the strong tendency of scientists in all ages to exempt human behavior from the natural laws of physics, and to regard motions resulting from human actions as original, in the sense that they need not be attributed to other motions.  We'll see in Section 9 that Aristotle's distinction between natural and violent motions plays a key role in the analysis of certain puzzling aspects of quantum theory.

 

We can also see that the ontological status of "force" in Aristotle's physics is ambiguous. In some circumstances it seems to be more an attribute of motion than a cause of motion. Even if we consider the quantitative physics of Galileo, Newton, and beyond, it remains true that "force", while playing a central role in the formulation, serves mainly as an intermediate quantity in the calculations. In fact, the concept of 'force' could almost be eliminated entirely from classical mechanics. (See section 4 for further discussion of this.) Newton wrestled with the question of whether force should be regarded as an observable or simply a relation between observables. Interestingly, Ernst Mach regarded the third law as Newton's most important contribution to mechanics, even though others have criticized it as being more a definition than a law.

 

Newton’s struggle to find the "right" axiomatization of mechanics can be seen by reading the preliminary works he wrote leading up to The Principia, such as "De motu corporum in gyrum" (On the motion of bodies in an orbit). At one point he conceived of a system with five Laws of Motion, but what finally appeared in Principia were eight Definitions followed by three Laws. He defined the "quantity of matter" as the measure arising conjointly from the density and the volume. In his critical review of Newtonian mechanics, Mach remarked that this definition is patently circular, noting that "density" is nothing but the quantity of matter per volume. However, all definitions ultimately rely on undefined (irreducible) terms, so perhaps Newton was entitled to take density and volume as two such elements of his axiomatization. Furthermore, by basing the quantity of matter on explicitly finite density and volume, Newton deftly precluded point-like objects with finite quantities of matter, which would imply the existence of infinite forces and infinite potential energy according to his proposed inverse-square law of gravity. 

 

The next basic definition in Principia is of the "quantity of motion", defined as the measure arising conjointly from the velocity and the quantity of matter. Here we see that "velocity" is taken as another irreducible element, like density and volume. Thus, Newton's ontology consists of one irreducible entity, called matter, possessing three primitive attributes, called density, volume, and velocity, and in these terms he defines two secondary attributes, the "quantity of matter" (which we call "mass") as the product of density and volume, and the "quantity of motion" (which we call "momentum") as the product of velocity and mass, meaning it is the product of velocity, density, and volume. Although the term "quantity of motion" suggests a scalar, we know that velocity is a vector (i.e., it has a magnitude and a direction), so it's clear that momentum as Newton defined it is also a vector. After going on to define various kinds of forces and the attributes of those forces, Newton then, as we saw in Section 1.3, took the law of inertia and relativity as his First Law of Motion, just as Descartes and Huygens had done. Following this we have the "force law", i.e., Newton's Second Law of Motion:

 

The change of motion is proportional to the motive force impressed; and is made in the direction of the right line in which the force is impressed.

 

Notice that this statement doesn't agree precisely with either of the two forms in which the Second Law is commonly given today, namely, as F = dp/dt or F = ma. The former is perhaps closer to Newton's actual statement, since he expressed the law in terms of momentum rather than acceleration, but he didn't refer to the rate of change of momentum. No time parameter appears in the statement at all. This is symptomatic of a lack of clarity (as in Aristotle’s writings) over the distinction between "impulse force" and "continuous force". Recall that our speculative interpretation of Aristotle's downward "weight" was based on the idea that he actually had in mind something like the impulse force that would be exerted by the object if it were abruptly brought to a halt. Newton's Second Law, as expressed in the Principia, seems to refer to such an impulse, and this is how Newton used it in the first few Propositions, but he soon began to invoke the Second Law with respect to continuous forces of finite magnitude applied over a finite length of time – more in keeping with a continuous force of gravity, for example. This shows that even in the final version of the axioms and definitions laid down by Newton, he did not completely succeed in clearly delineating the concept of force that he had in mind. Of course, in each of his applications of the Second Law, Newton made the necessary dimensional adjustments to appropriately account for the temporal aspect that was missing from the statement of the Law itself, but this was done ad hoc, with no clear explanation. (His ability to reliably incorporate these factors in each context testifies to his solid grasp of the new dynamics, despite the imperfections of his formal articulation of it.) Subsequent physicists clarified the quantitative meaning of Newton’s second law, explicitly recognizing the significance of time, by expressing the law either in the form F = d(mv)/dt or else in what they thought was the equivalent form F = m(dv/dt). Of course, in the context of special relativity these two are not equivalent, and only the former leads to a coherent formulation of mechanics. (It’s also worth noting that, in the context of special relativity, the concept of force is largely an anachronism, and it is introduced mainly for the purpose of relating relativistic descriptions to their classical counterparts.)
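
To spell out the inequivalence in modern notation (a statement added here only for definiteness; it appears in neither Newton nor Einstein's 1905 paper): writing the relativistic momentum as p = mvγ with γ = (1 − v^2/c^2)^(−1/2), the momentum form of the law gives, for rectilinear motion,

F = d(mvγ)/dt = m γ^3 (dv/dt)

whereas F = m(dv/dt) omits the factor γ^3. The two agree only in the limit of speeds small compared with c, and only the momentum form meshes with the relativistic conservation laws.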

 

The third Law of Motion in the Principia is regarded by many people as one of Newton's greatest and most original contributions to physics. This law states that

 

To every action there is always opposed an equal reaction: or, the mutual actions of two bodies upon each other are always equal, and directed to contrary parts.

 

Unfortunately the word "action" is not found among the previously defined terms, but in the subsequent text Newton clarifies the intended meaning. He says "If a body impinge upon another, and by its force change the motion of the other, that body also... will undergo an equal change in its own motion towards the contrary part." In other words, the net change in the "quantity of motion" (i.e., the sum of the momentum vectors) is zero, so momentum is conserved. More subtly, Newton observes that "If a horse draws a stone tied to a rope, the horse will be equally drawn back towards the stone". This is true even if neither the horse nor the stone is moving (which of course implies that they are each subject to other forces as well, tending to hold them in place). This illustrates how the concept of force enables us to conceptually decompose a null net force into non-null components, each representing the contributions of different physical interactions.
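
The first reading of the third law (equal and opposite changes of motion) can be illustrated with a small numerical sketch (my own, with arbitrary masses and an arbitrary mutual force); the sum of the momenta is unchanged however long the interaction lasts:

    m1, m2 = 2.0, 4.0           # two interacting bodies
    v1, v2 = 3.0, -1.0          # their initial velocities
    F, dt  = 8.0, 0.25          # force of body 2 on body 1, and a small time step

    p_initial = m1*v1 + m2*v2
    for _ in range(4):          # let the mutual forces act over a finite interval
        v1 +=  F*dt / m1        # body 1 feels +F
        v2 += -F*dt / m2        # body 2 feels the equal and opposite reaction
    p_final = m1*v1 + m2*v2

    print(p_initial, p_final)   # 2.0 2.0  (the total quantity of motion is conserved)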

 

In retrospect we can see that Newton's three "laws of motion" actually represent the definition of an inertial coordinate system. For example, the first law imposes the requirement that the spatial coordinates of any material object free of external forces are linear functions of the time coordinate, which is to say, free objects move with a uniform speed in a straight line with respect to an inertial coordinate system. Rather than seeing this as a law governing the motions of free objects with respect to a given system of coordinates, it is more correct to regard it as defining a class of coordinate systems in terms of which a recognizable class of motions has particularly simple descriptions. It is then an empirical question as to whether the phenomena of nature possess the attributes necessary for such coordinate systems to exist.
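
As a purely illustrative rendering of this defining requirement, with respect to an inertial coordinate system the recorded positions of a force-free particle at equal time steps should have vanishing second differences, i.e., the motion is linear in the time coordinate. A sketch of such a check, with invented sample data:

    xs = [0.0, 1.5, 3.0, 4.5, 6.0]   # positions of a free particle at equal time steps (invented)

    second_diffs = [xs[i+1] - 2*xs[i] + xs[i-1] for i in range(1, len(xs) - 1)]
    print(all(abs(d) < 1e-9 for d in second_diffs))   # True: uniform speed in a straight line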

 

The significance of “force” was already obscure in Newton’s three laws of mechanics, but it became even more obscure when he proposed the law of universal gravitation, according to which every particle of matter exerts a force of attraction on every other particle of matter, with a strength proportional to its mass and inversely proportional to the square of the distance. The rival Cartesians expected all forces to be the result of local contact between bodies, as when two objects press directly against each other, but Newton’s conception of instantaneous gravity between distant objects seems to defy representation in those terms. In an effort to reconcile universal gravitation with semi-Cartesian ideas of force, Newton’s young friend Nicolas Fatio hypothesized an omni-directional flux of small “ultra-mundane” particles, and argued that the mutual shadowing effect could explain why massive bodies are forced together. The same idea was later taken up by Lesage, but many inconsistencies were pointed out, making it clear that no such theory could accurately account for the phenomena. The simple notion of force at a distance was so successful that it became the model for all mutual forces between objects, and the early theories of electricity and magnetism were expressed in those terms. However, reservations about the intelligibility of instantaneous action at a distance remained. Eventually Faraday and Maxwell introduced the concept of disembodied “lines of force”, which later came to be regarded as fields of force, almost as if force was an entity in its own right, capable of flowing from place to place. In this way the Maxwellians (perhaps inadvertently) restored the Cartesian ideas that all space must be occupied and that all forces must be due to direct local contact. They accomplished this by positing a new class of entity, namely the field. Admittedly our knowledge of the electromagnetic field is only inferred from the behavior of matter, but it was argued that explanations in terms of fields are more intelligible than explanations in terms of instantaneous forces at a distance, mainly because fields were considered necessary for strict conservation of energy and momentum once it was recognized that electromagnetic effects propagate at a finite speed.

 

However, the explanation of phenomena in terms of fields, characterized by partial differential equations, was incomplete, because it was not possible to represent stable configurations of matter in these terms. Maxwell’s field equations are linear, so there was no hope of them possessing solutions corresponding to discrete electrical charges or particles of matter. Hence it was still necessary to retain the laws of mechanics of discrete entities, characterized by total differential equations. The conceptual dichotomy between Newton’s physics of particles and Maxwell’s physics of fields is clearly shown by the contrast between total and partial differential equations, and this contrast was seen (by some people at least) as evidence of a fundamental flaw. In a 1936 retrospective essay Einstein wrote

 

This is the basis on which H. A. Lorentz obtained his synthesis of Newton’s mechanics and Maxwell’s field theory. The weakness of this theory lies in the fact that it tried to determine the phenomena by a combination of partial differential equations (Maxwell’s field equations for empty space) and total differential equations (equations of motions of points), which procedure was obviously unnatural.

 

The difference between total and partial differential equations is actually more profound than it may appear at first glance, because (as alluded to in section 1.1) it entails different assumptions about the existence of free-will and acts of volition. If we consider a point-like particle whose spatial position x(t) is strictly a function of time, and we likewise consider the forces F(t) to which this particle is subjected as strictly a function of time, then the behavior of this particle can be expressed in the form of total differential equations, because there is just a single independent variable, namely the time coordinate. Every physically meaningful variable exists as one of a countable number of explicit functions of time, and each of the values is realized at its respective time. Thus the total derivatives are evaluated over actualized values of the variables. In contrast, the partial derivatives over immaterial fields are inherently hypothetical, because they represent the variations in some variable of a particle not as a function of time along the particle’s actual path, but transversely to the particle’s path. For example, rather than asking how the force experienced by a particle changes over time, we ask how the force would change if at this instant of time the particle was in a slightly different position. Such hypotheticals have meaning only assuming an element of contingency in events, i.e., only if we assume the paths of material objects could be different than they are.

 

Of course, if we were to postulate a substantial continuous field, we could have non-hypothetical partial derivatives, which would simply express the facts implicit in the total derivatives for each substantial part of the field. However, the intelligibility of a truly continuous extended substance is questionable, and we know of no examples of such a thing in nature. Given that the elementary force fields envisaged by the Maxwellians were eventually conceded to be immaterial, and their properties could only be inferred from the state variables of material entities, it’s clear that the partial derivatives over the field variables are not only hypothetical, but entail the assumption of freedom of action. In the absence of freedom, any hypothetical transverse variations in a field (i.e., transverse to the actual paths of material entities) would be meaningless. Only actual variations in the state variables of material entities would have meaning. Thus the contrast between total and partial differential equations reflects two fundamentally different conceptual frameworks, the former based on determinism and the latter based on the possibility of free acts. This is closely analogous to Aristotle’s dichotomy between natural and violent motions.

 

As noted above, Einstein regarded this dualism as unnatural, and his intuition led him to expect that the field concept, governed by partial differential equations, would ultimately prove to be sufficient for a complete description of phenomena. In the same essay mentioned above he wrote

 

What appears certain to me, however, is that, in the foundations of any consistent field theory, there should not be, in addition to the concept of the field, any concept concerning particles. The whole theory must be based solely on partial differential equations and their singularity-free solutions.

 

It may seem ironic that he took this view, considering that Einstein was such a staunch defender of strict causality and determinism, but by this time he was wholly committed to the concept of a continuous field as the ultimate ontological entity, more fundamental even than matter, and possessing a kind of relativistic substantiality, subject to deterministic laws. In a sense, he seems to have come to believe that the field was not a hypothetical entity inferred from the observed behavior of material bodies, but rather that material bodies were hypothetical entities inferred from the observed behavior of fields. An important first step in this program was to eliminate the concept of forces acting between bodies, and to replace this with a field-theoretic model. He (arguably) accomplished this for gravitation with the general theory of relativity, which completely dispenses with the concept of a "force of gravity", and instead interprets objects under the influence of gravity as simply proceeding, unforced, along the most natural (geodesic) paths. Thus the concept of force, and particularly gravitational force, which was so central to Newton's synthesis, was simply discarded as having no absolute significance.

 

However, the concept of force is still very important in physics, partly because we continue to employ the classical formulation of mechanics in the limit of low speeds and weak gravity, but more importantly because it has not proven possible (despite the best efforts of Einstein and others) to do for the other forces of nature what general relativity did for gravity, i.e., to express the apparently forced (violent) motions as natural paths through a modified geometry of space and time. 

 

3.3  De Mora Luminis

 

I see my light come shining,

From the west down to the east.

Any day now, any day now,

I shall be released.

                      Bob Dylan, 1967

 

We are usually not aware of any delay between the occurrence of an event and its visual appearance in the eye of a distant observer. In fact, a single visual "snapshot" is probably the basis for most people's intuitive notion of an "instant". However, the causal direction of an instantaneous interaction is inherently ambiguous, so it's perhaps not surprising that ancient scholars considered two competing models of vision, one based on the idea that every object is the source of images of itself, emanating outwards to the eye of the observer, and the other claiming that the observer's eye is the source of visual rays emanating outwards to "feel" distant objects. An interesting synthesis of these two concepts is the idea, adopted by Descartes, of light as a kind of pressure in an ideal incompressible medium that conveys forces and pressures instantaneously from one location to another. However, even with Descartes we find the medium described as "incompressible, or nearly incompressible", revealing the difficulty of reconciling instantaneous force at a distance with our intuitive idea of causality. Fermat raised this very objection when he noted (in a letter on Descartes' Dioptrics) that if we assume instantaneous transmission of light we are hardly justified in analyzing such transmissions by means of analogies with motion through time.

 

Perhaps urged by the sense that any causal action imposed from one location on another must involve a progression in time, many people throughout history have speculated that light may propagate at a finite speed, but all efforts to discern a delay in the passage of light (mora luminis) failed. One of the earliest such attempts of which we have a written account is the experiment proposed by Galileo, who suggested (in his Dialogues Concerning Two New Sciences) relaying a signal back and forth with lamps and shutters located on separate hilltops. Based on the negative results from this type of crude experiment, Galileo could only confirm what everyone already knew, namely, that the propagation of light is "if not instantaneous, then extraordinarily fast". He went on to suggest that it might be possible to discern, in distant clouds, some propagation time for the light emitted by a lightning flash.

 

We see the beginning of this light – I might say its head and source – located at a particular place among the clouds; but it immediately spreads to the surrounding ones, which seems to be an argument that at least some time is required for propagation. For if the illumination were instantaneous and not gradual, we should not be able to distinguish its origin – its center, so to speak – from its outlying portions.

 

The idea of using the clouds in the night sky as a giant bubble chamber was characteristic of Galileo’s talent for identifying opportunities in natural phenomena for testing ideas, as well as his attentiveness to subtle qualitative impressions, such as the sense of being able to distinguish the “center” of the illumination given off by a flash of lightning, even though we can’t quantify the delay time. It also shows that Galileo was inclined to think light propagated at a finite speed, but of course he rightly qualified this lightning-cloud argument by admitting that “really these matters lie far beyond our grasp”. Today we would say the perceived “spreading out” of a lightning strike through the clouds is due to propagation of the electrical discharge process. Even for clouds located ten miles apart, the time for light itself to propagate from one cloud to the other is only one 18,600th of a second, presumably much too short to give any impression of delay to human senses.

 

Interestingly, Galileo also contributed (posthumously) to the first successful attempt to actually observe a delay attributable to the propagation of light at a finite speed. In 1610, soon after the invention of the telescope, he discovered the four largest moons of Jupiter, illustrated below:

 

[pic]

 

In hopes of gaining the patronage of the Grand Duke Cosimo II, Galileo named Jupiter's four largest moons the "Medicean Stars", but today they're more commonly called the Galilean satellites. At their brightest these moons would be just bright enough (with magnitudes between 5 and 6) to be visible from Earth with the naked eye - except that they are normally obscured by the brightness of Jupiter itself. (Interestingly, there is some controversial evidence suggesting that an ancient Chinese astronomer may actually have glimpsed one of these moons 2000 years before Galileo.) Of course, from our vantage point on Earth, we must view the Jupiter system edgewise, so the moons appear as small stars that oscillate from side to side along the equatorial plane of Jupiter. If all four moons were simultaneously at their greatest elongations (i.e., if the lines from Jupiter to each moon were perpendicular to our line of sight), and all on the same side of Jupiter, they would look like this:

 

[pic]

 

By the 1660's, detailed tables of the movements of these moons had been developed by Borelli (1665) and Cassini (1668). Naturally these tables were based mainly on observations taken around the time when Jupiter is nearly "in opposition", which is to say, when the Earth passes directly between Jupiter and the Sun, because this is when Jupiter appears high in the night sky. The mean orbital periods of Jupiter's four largest moons were found to be 1.769 days, 3.551 days, 7.155 days, and 16.689 days, and these are very constant and predictable (especially for the two inner moons), like a giant clockwork. (In fact, there were serious attempts in the 18th century to develop a system of tables and optical instruments so that the "Jupiter clock" could be used by sailors at sea to determine Greenwich Meridian time, from which they could infer their longitude.) Based on these figures it was possible to predict within minutes the times of eclipses and passages (i.e., the passings behind and in front of Jupiter) that would occur during the viewing opportunities in future "oppositions". In particular, the innermost satellite, Io (which is just slightly larger than our own Moon), completes one revolution around Jupiter every 42.456 hours. Therefore, when viewed from the Earth, we expect to see Io pass behind Jupiter once every 42 hours, 27 minutes, and 21 seconds - assuming the light from each such eclipse takes the same amount of time to reach the Earth.
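
As a quick check of the arithmetic quoted above, here is a minimal sketch in Python (using only the 1.769-day period given in the text):

    # Io's mean orbital period, converted to the expected interval
    # between successive eclipses as seen from Earth.
    period_days = 1.769
    hours = period_days * 24                 # 42.456 hours
    h = int(hours)
    m = int((hours - h) * 60)
    s = ((hours - h) * 60 - m) * 60
    print(h, m, round(s, 1))                 # about 42 h 27 min 21.6 s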

 

By the 1670's people began to make observations of Jupiter's moons from the opposite side of the Earth's orbit, i.e., when the Earth was on the opposite side of the Sun from Jupiter, and they observed a puzzling phenomenon. Obviously it's more difficult to make measurements at these times, because the Jovian system is nearly in conjunction with the Sun, but at dawn and dusk it is possible to observe Jupiter even when it is fairly close to conjunction. These observations, taken about 6 months away from the optimum viewing times, reveal that the eclipses and passages of Jupiter's innermost moon, Io, which could be predicted so precisely when Jupiter is in opposition, are consistently late by about 17 minutes relative to their predicted times of occurrence. (Actually the first such estimate, made by the Danish astronomer Ole Roemer in 1675, was 22 minutes.) This is not to say that the time intervals between successive eclipses are increased by 17 minutes, but that the absolute time of occurrence is 17 minutes later than was predicted six months earlier based on the observed orbital period at that time. Since Io has a period of 1.769 days, it completes about 103 orbits in six months, and it appears to lose a total of 17 minutes during those 103 orbits, which is an average of about 9.9 seconds per orbit.
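
The quoted figure of about 9.9 seconds per orbit follows directly from these numbers; a minimal sketch in Python:

    # Cumulative 17-minute delay spread over the orbits Io completes in six months.
    orbits = (365.25 / 2) / 1.769        # about 103 orbits
    print(17 * 60 / orbits)              # about 9.9 seconds of apparent slippage per orbit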

 

Nevertheless, at the subsequent "opposition" viewing six months later, Io is found to be back on schedule! It's as if a clock runs slow in the mornings and fast in the afternoons, so that on average it never loses any time from day to day. While mulling over this data in 1675, it occurred to Roemer that he could account for the observations perfectly if it is assumed that light propagates at a finite speed. At last someone had observed the mora luminis. Light travels at a finite speed, which implies that when we see things we are really seeing how they were at some time in the past. The further away we are from an object, the greater the time delay in our view of that object. Applying this hypothesis to the observations of Jupiter's moons, Roemer considered the case when Jupiter was in opposition on, say, January 1, so the light from the Jovian eclipses was traveling from the orbit of Jupiter to the orbit of the Earth, as shown in the figure below.

 

[pic]

 

The intervals between successive eclipses around this time will be very uniform near the opposition point, because the eclipses themselves are uniform and the distance from Jupiter to the Earth is fairly constant during this time. However, after about six and a half months (denoted by July 18 in the figure), Jupiter is in conjunction, which means the Earth is on the opposite side of its orbit from Jupiter. The light from the "July 18" eclipse will still cross the Earth's orbit (on the near side) at the expected time, but it must then travel an additional distance, equal to the diameter of the Earth's orbit, in order to reach the Earth. Hence we should expect it to be "late" by the amount of time required for light to travel the Earth's orbital diameter. Combining this with a rough estimate of the distance from the Earth to the Sun, Huygens reckoned that light must travel at about 209,000 km/sec. A subsequent estimate by Newton gave a value around 241,000 km/sec. In fact, the Scholium to Proposition 96 of Newton's Principia includes the statement

 

For it is now certain from the phenomenon of Jupiter's satellites, confirmed by the observations of different astronomers, that light is propagated in succession, and requires about seven or eight minutes to travel from the sun to the earth.

 

The early quantitative estimates of the speed of light were obviously impaired by the lack of precise knowledge of the Earth-Sun distance. Using modern techniques, the Earth's orbital diameter is estimated to be about 2.98 x 10^11 meters, and the observed time delay in the eclipses and passages of Jupiter's moons when viewed from the Earth with Jupiter in conjunction is about 16.55 minutes = 993 seconds, so we can deduce from these observations that the speed of light is about 2.98 x 10^11 / 993  ≈  3 x 10^8 meters/sec.
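
Expressed as a minimal calculation in Python (using only the figures quoted above):

    # Roemer-style estimate: the extra delay near conjunction is the time
    # required for light to cross the diameter of the Earth's orbit.
    orbital_diameter_m = 2.98e11           # meters
    delay_s = 16.55 * 60                   # 993 seconds
    print(orbital_diameter_m / delay_s)    # about 3.0e8 m/s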

 

Of course, Roemer's hypothesis implies a specific time delay for each point of the orbit, so it can be corroborated by making observations throughout the year. We find that most of the discrepancy occurs during the times when the distance between Jupiter and the Earth is changing most rapidly, which is when the Earth-Sun axis is nearly perpendicular to the Jupiter-Sun axis. At one of these positions the Earth is moving almost directly toward Jupiter, and at the other it is moving almost directly away from Jupiter, as shown in the figure below.

 

[pic]

 

The Earth's speed relative to Jupiter at these points is essentially just its orbital speed, which is the circumference of its orbit divided by one year. Thus we have

 

[pic]

 

which is equivalent to about 3 x 10^4 meters/sec. If we choose units so that c = 1, then we have v = 0.0001. From this point of view the situation can be seen as a simple application of the Doppler effect, and the frequency of the eclipses as viewed on Earth can be related to the actual frequency (which is what we observe at conjunction and opposition) according to the formulas

 

[pic]

 

The frequencies are inversely proportional to the time intervals between eclipses. These formulas imply that, for the moon Io, whose orbital period is 1.769 days = 2547.3600 minutes, the time interval between consecutive observed eclipses when the Earth is moving directly toward Jupiter (indicated as "Jan" in the above figure) is 2547.1052 minutes, and the time interval between successive observed eclipses six months later is 2547.6147 minutes. Thus the interval between observed eclipses is 15.2 seconds shorter than nominal in the former case, and it is 15.3 seconds longer than nominal in the latter case, making a total difference of 30.5 seconds between the inter-arrival times at the two extremes, separated by six months. It would have been difficult to keep time this accurately in Roemer's day, but differences of this size are easily measured with modern clocks. By the way, the other moons of Jupiter do not conform so nicely to Roemer's hypothesis, but this is because their orbital motions are inherently more irregular due to their mutual gravitational interactions.
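
These figures can be checked with a short calculation. The sketch below (in Python) uses the classical first-order Doppler relation described above; the astronomical constants are ordinary reference values rather than figures taken from the text:

    import math

    # Earth's orbital speed and the ratio v/c (about 0.0001, as noted above).
    c = 2.998e8                                      # m/s
    v = 2 * math.pi * 1.496e11 / (365.25 * 86400)    # ~2.98e4 m/s
    beta = v / c

    # Io's nominal period, and the observed eclipse intervals when the Earth
    # is moving directly toward, and directly away from, Jupiter.
    T0 = 1.769 * 24 * 60                             # 2547.36 minutes
    T_toward = T0 / (1 + beta)                       # ~2547.11 minutes
    T_away   = T0 / (1 - beta)                       # ~2547.61 minutes
    print(T_toward, T_away, (T_away - T_toward) * 60)  # spread of roughly 30 seconds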

 

Despite the force of Roemer's analysis, and the early support of both Huygens and Newton, most scientists remained skeptical of the idea of a finite speed of light. It was not until 50 years later, when the speed of light was evaluated in a completely different way, arriving at nearly the same value, that the idea became widely accepted. This occurred in 1729, when the velocity of light was estimated by James Bradley based on observations of the aberration of starlight, which he argued must depend on the ratio of the speed of light to the orbital speed of the Earth. Based on the best measurements of the limiting starlight aberration 20.4" ± 0.1" by Otto Struve, and taking the speed of the Earth to be about 30.56 km/sec from Encke's solar parallax estimate of 8.57" ± 0.04", this implied a light speed of about 308,000 km/sec.
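
A rough reconstruction of Bradley's reasoning with these numbers, as a sketch in Python (the small-angle relation tan(α) ≈ v/c is the standard aberration formula; the slight difference from the quoted 308,000 km/sec is just rounding in the inputs):

    import math

    alpha = 20.4 / 206265              # Struve's aberration constant, in radians
    v_earth = 30.56                    # km/s, inferred from Encke's parallax
    print(v_earth / math.tan(alpha))   # roughly 3.09e5 km/s, near the quoted value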

 

Unfortunately, Encke's parallax estimates had serious problems, and he greatly under-estimated his error band. The determination of the Earth-Sun distance was a major challenge for scientists in the 18th century. Interestingly, the primary mission of Captain James Cook when he embarked in the ship Endeavour on his famous voyage in 1768 was to observe the transit of the planet Venus across the disk of the Sun from the vantage point of Tahiti in the South seas, with the aim of determining the distance from the Sun to the Earth by parallax. Roughly once each century two such transits are visible from the Earth, occurring eight years apart. Edmund Halley had urged that when the next opportunities arose on June 6, 1761, and June 3, 1769, the transits be observed from as many vantage points as possible on the Earth's surface to make the best possible determination. The project was undertaken by people from many countries. Le Gentil traveled to India for the 1761 transit, but since England and France were antagonists at the time, he had to dodge the English war ships, causing him to reach India just after June 6, missing the first transit of Venus. Determined not to miss the second one, he remained in India for the next eight years (!) "doing various useful work" (according to Pannekoek) until June 3, 1769. Alas, when the day arrived, it was too cloudy to see anything.

 

Cook's observations fared somewhat better. The French government actually issued orders to its war ships to leave Cook alone, since he was "on a mission for the benefit of all mankind". The Endeavour arrived in Tahiti on April 13, 1769, and the scientists were able to make observations in the clear on June 3. Unfortunately, the results were disappointing, not only in Tahiti, but all over the world. It turned out to be extremely difficult to judge precisely (to within, say, 10 seconds) when one edge of Venus passed the border of the Sun. The black disk of the planet appeared to "remain connected like a droplet" to the border of the Sun, until suddenly the connection was broken and the planet was seen to be well past the border. Observers standing right next to each other recorded times differing by tens of seconds. Consequently the observations failed to yield an improved estimate of the Earth-Sun distance.

 

The first successful determinations of c based solely on terrestrial measurements were Fizeau's toothed-wheel experiment of 1849, followed by Foucault's rotating-mirror experiment of 1862. The toothed wheel didn't work very well, and it was hard to say how accurate it was, but the rotating mirrors led to a value of about 298,000 ± 500 km/sec, significantly below the earlier estimates. Foucault was confident the discrepancy with earlier results couldn't be explained by an error in the aberration angle, so he inferred (correctly) that Encke's solar parallax estimate (and therefore the orbital velocity of the Earth) was in error, and proposed a value of 8.8", which was subsequently confirmed and refined by new observations, as well as by a re-analysis of Encke's 1769 data using better longitudes, yielding an estimate of 8.83".

 

In later refinements of these terrestrial methods, Kerr cells were used to increase the speed of switching the light signal; these rely on the fact that the refractive index of certain substances can be made to vary with an applied electric field. Further refinements led to large-baseline devices, called geodimeters, originally intended for use in geodetic surveying. Here is a summary of the major published determinations of the speed of optical light based on one or another of these techniques:

 

[pic]

 

Measurements of the speed of electromagnetic waves at radio frequencies have also been made, with the results summarized below:

 

[pic]

 

In addition, the speed of light can be determined indirectly by measuring the ratio of electric to magnetic units, which amounts to measuring the permittivity of the vacuum. Some results given by this method are summarized below:

 

[pic]

 

(Several of the above values include corrections for various group-velocity indices.) A plot of the common logarithm of the tolerance versus the year for the 19 optical light speed measurements is shown below:

 

[pic]

 

Interestingly, comparing each of the measured values with Evenson's 1973 value, we find that more than half of them were in error by more than their published tolerances. This is not so surprising when we note that most of the tolerances were quoted as "one sigma" error bands rather than as absolute limits. Indeed, if we consider the two-sigma band, there were only four cases of over-optimism, and of those, all but Foucault's 1862 result are within three sigma, and even Foucault is within four sigma. This is roughly in agreement with what one would expect, especially for delicate and/or indirect measurements. Also, the aggressive error estimates in this field have had the beneficial effect of spurring controversies between different researchers, forcing them to repeat experiments and refine their techniques in order to resolve the disagreements. In this way, knowledge of the speed of light progressed in less than 400 years from Galileo's assessment, "extraordinarily fast", to the best modern value, 299,792.4574 ± 0.0012 km/sec. Today the unit of length is actually defined in terms of the distance traveled by light in a specified fraction of a second, so in effect we now define the meter such that the speed of light is exactly 299,792.458 km/sec.

 

Incidentally, Maxwell once suggested (in his article on Ether for the ninth edition of the Encyclopedia Britannica) that Roemer's method could be used to test for the isotropy of light speed, i.e., to test whether the speed of light is the same in all directions. After noting that any purely terrestrial measurement would yield an effect only of the second order in v/c, which he regarded as “quite insensible” (a remark that spurred Albert Michelson to successfully measure just such a quantity only two years later), he wrote

 

The only practicable method of determining directly the relative velocity of the aether with respect to the solar system is to compare the values of the velocity of light deduced from the observation of the eclipses of Jupiter's satellites when Jupiter is seen from the earth at nearly opposite points of the ecliptic.

 

Notice that, for this type of observation, the relevant speed is not the speed of the earth in its orbit around the sun, but rather the speed of the entire solar system. Roemer's method can be regarded as a means of measuring the speed of light in the direction from Jupiter to the Earth, and since Jupiter has an orbital period of about 12 years, we can use this method to evaluate the speed of light several times over a 12 year period, and thus evaluate the speed in all possible directions (in the plane of the ecliptic). If the sun were stationary, we would not expect to find any differences, but it was already suspected in Maxwell's time that the sun itself is in motion. The best modern estimate is that our solar system is moving with a speed of about 3.7 x 10^5 meters per second with respect to the cosmic microwave background radiation (i.e., the frame in which the radiation is roughly isotropic). If we assume a pre-relativistic model in which light propagates at a fixed speed with respect to the background radiation, and in which frames are related by Galilean transformations, we could in principle determine the "absolute speed" of the solar system. The magnitude of the effect is given by computing how much difference would be expected in the time for light to traverse one orbital diameter of the Earth at effective speeds of c+V and c−V, where V is the presumed absolute speed of the Earth. This gives a maximum difference of about 2.45 seconds between two measurements taken six years apart. (These two measurements each occur over a 6 month time span as explained above.)
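
The 2.45-second figure follows from a one-line calculation; a minimal sketch in Python using the numbers quoted above:

    # Difference in light's crossing time over one orbital diameter D at
    # effective speeds c+V and c-V (approximately 2*D*V/c**2).
    D = 2.98e11          # Earth's orbital diameter, meters
    c = 2.998e8          # m/s
    V = 3.7e5            # m/s, presumed absolute speed of the solar system
    print(D/(c - V) - D/(c + V))   # about 2.45 seconds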

 

In practice it would be necessary to account for many other uncontrolled variables, such as the variations in the orbits of the Earth and Jupiter over the six year interval. These would need to be known to much better than 1 part in 400 to give adequate resolution. To the best of my knowledge, this experiment has never been performed, because by the time sufficiently accurate clocks were available the issue of light's invariance with respect to inertial coordinate systems had already been established by more accurate terrestrial measurements, together with an improved understanding of the meaning of inertial coordinates. Today we are more likely to establish a system of coordinates optically, and then test to verify the isotropy of mechanical inertia with respect to those coordinates.

3.4  Stationary Paths

 

Then with no throbbing fiery pain,

No cold gradations of decay,

Death broke at once the vital chain,

And free’d his soul the nearest way.

                                Samuel Johnson, 1783

 

The apparent bending of visual images of objects partially submersed in water was noted in antiquity, but it wasn't until Kepler's Dioptrice, published in 1611, that anyone attempted to actually quantify the effect.  Kepler discovered that, at least for rays nearly perpendicular to the surface, the ratio of the angles of incidence and refraction is (nearly) proportional to the ratio of what we now call the indices of refraction of the media.  (Originally these indices were just empirically determined constants for each substance, but Newton later showed that for most transparent media the refractive index could be taken as unity plus a term proportional to the medium's density.)  Incidentally, Kepler also noticed that with suitable materials and angles of incidence, the nominal refracted angle would exceed 90 degrees, resulting in total internal reflection, which is the basic principle of modern fiber optics. 

 

In 1621, Willebrord Snell performed a series of careful measurements and found that when a ray of light passes through a surface at which the index of refraction changes abruptly, the angles made by the incident and transmitted rays with the respective outward normals to the surface are related according to the simple formula (now called Snell's Law)

 

[pic]

 

where n1 and n2 are the indices of refraction (still regarded simply as empirical constants for any given medium) on the incident and transmitted sides of the boundary, and θ1 and θ2 are the angles that the incident ray and the transmitted ray make with the normal to the boundary as shown below.

 

[pic]

 

Soon thereafter, Descartes published his La Dioptrique (1637), in which he presented a rationalization of Snell's law based on the idea that light is a kind of pressure transmitted instantaneously (or nearly so) through an elastic medium.  Descartes' theory led to a fascinating scientific dispute over the correct interpretation of light.  According to Descartes' mechanistic description, a dense medium must transmit light more effectively, i.e., with more "force", than a less dense medium.  (He sometimes described light rays in terms of a velocity vector rather than a force vector, but in either case he reasoned that the magnitude of the vector, which he called the light's determination, increased in proportion to the density of the medium.)  Also, Descartes argued that the tangential component of the ray vector remains constant as the ray passes through a boundary.  On the basis of these two (erroneous) premises, the parallelogram of forces for a ray of light passing from a less dense to a more dense medium is as shown below.

 

[pic]

 

The magnitude of the incident force is f, and the magnitude of the refracted force is F, each of which is decomposed into components normal and tangential to the surface.  Since Descartes assumes ft = Ft, it follows immediately that  f sin(θ1) = F sin(θ2).  If, as Descartes often did, we regard the force (determination) of the light as analogous to the speed of light, then this corresponds to the relation  v1 sin(θ1) = v2 sin(θ2)  where v1 and v2 are the speeds of light in the two media.

 

Fermat criticized Descartes' derivation, partly on mathematical grounds, but also because he disagreed with the basic physical assumptions.  In particular, Fermat believed that light must not only travel at a finite speed, it must travel slower (not faster) in a denser medium.  Thus he argued that the derivation of Snell's law presented by Descartes was invalid, and he suspected the law itself might even be wrong.  In his attempts to derive the "true" law of refraction, Fermat recalled the derivation of the law of reflection given by Hero of Alexandria in ancient times.  (Actually, Fermat got this idea by way of his friend Marin Cureau de la Chambre, who had repeated Hero's derivation in a treatise on optics in 1657.)  Hero asserted that light moves in a straight line in empty space, and reflects at equal angles when striking a mirror, for the simple reason that light prefers always to move along the shortest possible path.  As Archimedes had pointed out, the shortest path between two given points in space is a straight line, and this (according to Hero) explains why light rays are straight.  More impressively, Hero showed that when travelling from some point A to the surface of a plane mirror and then back out to some point B, the shortest path is the one for which the angles of incidence and reflection are equal.  These are ingenious observations, but unfortunately the same approach doesn't explain refraction, because in that case the shortest path between a point above water and a point below water (for example) would always be simply a straight line, and there would be no refraction at all.

 

At this point Fermat's intuition that light propagates with a characteristic finite speed, and that it moves slower in denser media, came to his aid, and he saw that both the laws of reflection and refraction (as well as rectilinear motion in free space) could be derived from the same principle if, instead of light traveling along a path that minimizes the spatial distance, we suppose it travels along the path that minimizes the temporal distance, i.e., light follows the path to its destination that will take the least possible time.  This conceptual step is fascinating for several reasons.  For one thing, we don't know on what basis Fermat "intuited" that the speed of light is not only finite (which had never yet been demonstrated), but that it possesses a fixed characteristic speed (which it must if a law of “least time” is to have any meaning), and that the speed is lower in more dense media (precisely opposite the view of Descartes and subsequently Newton and Maupertuis).  Furthermore, applying the principle of least time rather than least distance to the law of propagation of light clearly casts the propagation of light into the arena of four-dimensional spacetime, and it essentially amounts to an assertion that the laws of motion should be geodesic paths with a suitable spacetime metric.  Thus, Fermat's optical principle can be seen as a remarkable premonition of important elements of both special and general relativity.

 

To derive the law of refraction for a ray of light traveling through the boundary between two homogeneous media, Fermat argued that a ray traveling from point 1 to point 2 in the figure below would follow the path that minimized the total time of the journey.

 

[pic]

 

Letting v1 denote the speed of light in medium 1, and v2 denote the speed of light in medium 2, the total time of the journey is d1/v1 + d2/v2, which can be written in terms of the unknown  x  as

 

[pic]

 

Differentiating with respect to x gives

 

[pic]

 

Setting this to zero gives the relation

 

[pic]

 

which is equivalent to Snell's law  n1sin(θ1) = n2 sin(θ2)  provided we assume the refractive index of a medium is proportional to the inverse of the velocity of light in that medium.  Of course, since calculus hadn't been invented yet, Fermat's solution of the problem involved considerably more labor (and ingenuity) than shown above, but eventually he arrived at this result, which surprisingly was experimentally indistinguishable from the formula arising from Descartes' derivation, despite the fact that it was based on an opposite set of assumptions, namely, that the velocity (or the "force") of light in a given medium is directly proportional to the refractive index of that medium! 
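
For readers who prefer to see the differentiation carried out explicitly, here is a small symbolic sketch in Python (assuming the sympy library is available, and using the usual textbook geometry in which the ray travels from a point at height a above the boundary to a point at depth b below it, crossing at horizontal position x; the variable names are illustrative, not Fermat's):

    import sympy as sp

    x, a, b, L, v1, v2 = sp.symbols('x a b L v1 v2', positive=True)

    d1 = sp.sqrt(a**2 + x**2)            # path length in medium 1
    d2 = sp.sqrt(b**2 + (L - x)**2)      # path length in medium 2
    T = d1/v1 + d2/v2                    # total travel time

    # dT/dx = x/(v1*d1) - (L - x)/(v2*d2); setting this to zero gives
    # sin(theta1)/v1 = sin(theta2)/v2, i.e. Snell's law with n proportional to 1/v.
    print(sp.simplify(sp.diff(T, x)))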

 

It may seem strange that two opposite hypotheses as to the speed of light should lead to the same empirical result, but in fact without the ability to directly measure the speed of light in various media we cannot tell from the refractivities of materials whether the index is proportional to velocity or to the reciprocal of velocity.  Even though both assumptions lead to the same law of refraction, the dispute over the correct derivation of this law continued unabated, because each side regarded the other side's interpretation as a travesty of science.   Among those who believed light travels faster in denser media were Hooke and Newton, whereas Huygens derived the law of refraction based on his wave theory of light (see Section 8.9) and concluded that Fermat's hypothesis was correct, i.e., the speed of light was less in denser media.

 

More than a century later (around 1747) Maupertuis applied his "principle of least action" to give an elegant (albeit spurious) derivation of Snell's law from the hypothesis that light travels faster in denser media.  Maupertuis believed that the wisdom and economy of God is manifest in all the operations of nature, which necessarily proceed from start to finish in just such a way as to minimize the "quantity of action".  In a sense, this is closely akin to Fermat's principle of least time, since they are both primitive examples of what we would now call the calculus of variations.  However, Maupertuis developed an all-encompassing view of his "least action" principle, with mystical and religious implications, and he argued that it was the universal governing principle in all areas of physics, including mechanics, optics, thermodynamics, and all other natural processes.

 

Of course, the notion that the phenomena of nature must follow the "best possible" course was not new.  Plato's Phaedo quotes Socrates as saying

 

If then one wished to know the cause of each thing, why it comes to be or perishes or exists, one had to find what was the best way for it to be, or to be acted upon, or to act. On these premises then, it befitted a man to investigate only, about this and other things, what is best... he would tell me, first, whether the earth is flat or round, and then explain why it is so of necessity, saying which is better, and that it was better to be so...  I was ready to find out in the same way about the sun and the moon and the other heavenly bodies, about their relative speed, their turnings, and whatever else happened to them, how it is best that each should act or be acted upon...

 

The innovation of Maupertuis was to suggest a quantitative measure for the vague notion of "what is best" for all physical processes, and to demonstrate that this kind of reasoning can produce valid quantitative results in a wide range of applications.  His proposal was to minimize the product of mass, velocity, and displacement.  (Subsequently Lagrange clarified this by defining the action of a system as the spatial path integral of the product of mass and velocity.)  For a system whose mass does not change, Maupertuis regarded the action as simply proportional to the product of velocity and distance traveled.  To derive the law of refraction for a ray of light traveling through the boundary between two homogeneous media, Maupertuis argued that a ray traveling from point 1 to point 2 in the figure above would follow the path that minimized the total "action"  v1d1 + v2d2 of the journey.  This is identical to the quantity that Fermat minimized, except that the speeds appear in place of their reciprocals.  Since v1 and v2 are constants, the differentiation proceeds as before, except for the inverted speed constants, and we arrive at the relation

 

[pic]

 

which is equivalent to Snell's law  n1sin(θ1) = n2 sin(θ2)  provided we assume the refractive index of a medium is proportional to the velocity of light in that medium, more or less consistent with the views of Descartes, Hooke, and Newton.  Since the deviation of the refractive index from unity is known empirically to be roughly proportional to the density of the medium, this would imply that light travels faster in denser media, which Newton and the others found quite plausible.
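
The two variational conditions can be placed side by side (same geometry and notation as in the refraction figure above); in LaTeX form:

    \text{Fermat (least time):}\quad \frac{d}{dx}\!\left(\frac{d_1}{v_1}+\frac{d_2}{v_2}\right)=0
        \;\Longrightarrow\; \frac{\sin\theta_1}{v_1}=\frac{\sin\theta_2}{v_2}

    \text{Maupertuis (least action):}\quad \frac{d}{dx}\bigl(v_1 d_1 + v_2 d_2\bigr)=0
        \;\Longrightarrow\; v_1\sin\theta_1 = v_2\sin\theta_2

Both reduce to n1 sin(θ1) = n2 sin(θ2), under the opposite identifications n ∝ 1/v and n ∝ v respectively.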

 

No amount of experimenting with the relative refractions of various media would suffice to distinguish between these two possibilities (the refractive index being proportional to the velocity or the reciprocal velocity).  Only a direct measurement of the speed of light in two media with different refractive indices could accomplish this.  Such a measurement was not achieved until 1850, when Foucault passed rays of light through a tube, and by using a rapidly rotating mirror was able to show conclusively that light takes longer to traverse the tube when it is filled with water than when filled with air.  So, after 200 years of theorizing and speculation, the question was finally settled in favor of Fermat and Huygens, i.e., the index of refraction is inversely proportional to the speed of light in the medium.

 

It's worth noting that although Fermat was closer to the truth, his principle of "least time" is not strictly correct, because the modern formulation of "Fermat's Principle" states that light travels along a path for which the time is stationary (i.e., such that slight transverse changes in the path don't affect the travel time, to first order), not necessarily minimal.  In fact, it may even be maximal, as can be verified by looking at yourself in the concave surface of a shiny spoon.  The "reason" that light prefers stationary paths can be found in the theory of quantum electrodynamics and Feynman's "sum over all paths" interpretation, which shows that if neighboring paths take different amounts of time, the neighboring rays arrive at the destination out of phase, and cancel each other out, whereas they reinforce each other if the neighboring paths take the same amount of time, or differ by a whole number of periods of the wave.  A stark demonstration of this is given by diffraction gratings, in which the canceling regions of a mirror are scraped away, resulting in reflective properties that violate Hero's law of equal angles. 

 

The modified version of Fermat’s Principle (requiring stationary rather than minimal paths) has proven to be a remarkably useful approach to the formulation of all kinds of physical problems involving motion and change.  Also, subsequent optical experiments confirmed Fermat’s intuition that the index of refraction for a given medium was inversely proportional to the (phase) velocity  v  of light in the medium.  The modern definition of the refractive index is  n = c/v,  where the constant of proportionality c is the speed of light in a vacuum.  (The fact that Fermat and Descartes could reach identical conclusions, even though one assumed the index of refraction was proportional to v  while the other assumed it was proportional to 1/v, is less surprising when we recall that this is precisely the crucial symmetry for relativistic velocity-composition, as described in Section 1.8.) 

 

In any case, it's clear that Fermat's model of optics based on his principle of least time, when interpreted as a metrical theory, entails or suggests many of the important elements of the modern theory of relativity, including the fundamental assumption of a characteristic speed of light for each medium, the concept of a unified space-time as the effective arena of motion, and the assumption that natural motions follow geodesic paths.

3.5  A Quintessence of So Subtle A Nature

 

For his art did express

A quintessence even from nothingness,

From dull privations and lean emptiness;

He ruined me, and I am re-begot

Of absence, darkness, death; things which are not.

                                                            John Donne, 1633

 

Descartes (like Aristotle before him) believed that nature abhors a vacuum, and insisted that the entire universe, even regions that we commonly call "empty space", must be filled with (or, more precisely, must consist of) some kind of substance. He believed this partly for philosophical reasons, which might be crudely summarized as "empty space is nothingness, and nothingness doesn't exist". He held that matter and space are identical and co-extensive (ironically similar to Einstein's later notion that the gravitational field is identical with space). In particular, Descartes believed an all-pervasive substance was necessary to account for the propagation of light from the Sun to the Earth (for example), because he rejected any kind of "action at a distance", and he regarded direct mechanical contact (taken as a primitive operation) as the only intelligible means by which two objects can interact. He conceived of light as a kind of pressure, transmitted instantaneously from the source to the eye through an incompressible intervening medium. Others (notably Fermat) thought it more plausible that light propagated with a finite velocity, which was corroborated by Roemer's 1675 observations of the moons of Jupiter. The discovery of light's finite speed was a major event in the history of science, because it removed any operational means of establishing absolute simultaneity. The full significance of this was not appreciated for more than two hundred years.

 

More immediately, it was clear that the conception of light as a simple pressure was inadequate to account for the different kinds of light, i.e., the phenomenon of color. To remedy this, Robert Hooke suggested that the (longitudinal) pressures transmitted by the ether may be oscillatory, with a frequency corresponding to the color. This conflicted with the views of Newton, who tended to regard light as a stream of particles in an empty void. Huygens advanced a fairly well-developed wave theory, but could never satisfactorily answer Newton's objections about the polarization of light through certain crystals ("Iceland spar"). This difficulty, combined with Newton's prestige, made the particle theory dominant during the 1700's, although many people, notably Jean Bernoulli and Euler, held to the wave theory.

 

In 1800 Thomas Young reconciled polarization with the wave theory by postulating that light actually consists of transverse rather than longitudinal waves, and on this basis - along with Fresnel's explanation of diffraction in terms of waves - the wave theory gained wide acceptance. However, Young's solution of the polarization problem immediately raised a new one, namely, how a system of transverse waves could exist in the ether, which had usually been assumed to be akin to a tenuous gas or fluid. This prompted generations of physicists, including Navier, Stokes, Kelvin, Malus, Arago, and Maxwell to become actively engaged in attempts to explain optical phenomena in terms of a material medium; in fact, this motivated much of their work in developing the equations of state for elastic media, which have proven to be so useful for the macroscopic treatment of fluids. However, despite the fruitfulness of this effort for the development of fluid dynamics, no one was ever able to accurately account for all optical and electro-magnetic phenomena in terms of the behavior of an ordinary fluid medium, with or without viscosity and/or compressibility.

 

There were a number of reasons for this failure. First, an ordinary fluid (even a viscous fluid) can't sustain shear stresses at rest, so it can propagate only longitudinal waves, as opposed to the transverse wave structure of light implied by the phenomenon of polarization. This implies either that the luminiferous ether must be a solid, or else we must postulate some kind of persistent dynamics (such as vortices) in the fluid so that it can sustain shear stresses. Unfortunately, both of these alternatives lead to difficulties. The assumption of a solid ether is difficult to reconcile with the fact that the equations of state for ordinary elastic solids always yield longitudinal waves accompanying any transverse waves - typically with different velocities. Such longitudinal disturbances are never observed with respect to optical phenomena. On the other hand, the assumption of a fluid ether with persistent flow patterns to sustain the required shear stresses entails a highly coordinated and organized system of flow cells that could persist only with the active participation of countless tiny “Maxwell demons” working furiously at each point to sustain it. Lacking this, the vortices are inherently unstable (even in an ideal perfect inviscid fluid, in which vorticity is strictly conserved), so these flow cells could not exert the effects on ordinary matter that they must if they are to serve as the mechanism of electromagnetic forces. Even the latter-day concept of an ether consisting of a superfluid (i.e., the viscosity-free quantum hydrodynamical state achieved by some substances such as helium when cooled to near absolute zero) faces the same problem of sustaining its specialized state while simultaneously interacting with ordinary matter in the required ways. As Maxwell acknowledged

 

No theory of the constitution of the ether has yet been invented which will account for such a system of molecular vortices being maintained for an indefinite time without their energy being gradually dissipated into that irregular agitation of the medium which, in ordinary media, is called heat.

 

Thus, ironically, the concept of transverse waves - proposed by Young and Fresnel as a means of accounting for polarization of light in terms of a mechanical wave propagating in some kind of material ether - immediately led to considerations that ultimately undermined confidence in the physicality and meaningfulness of that ether.

 

Even aside from the difficulty of accounting for exclusively transverse waves in a material medium, the idea of a substantial ether filling all of space had always faced numerous difficulties. For example, Newton had shown (in his demolition of Descartes' vortex theory) that the evidently drag-free motion of the planets and comets was flatly inconsistent with the presence of any significant density of interstitial fluid. This problem is especially acute when we remember that, in order to account for the high speed of light, the density and rigidity of the putative ether must be far greater than that of steel. Serious estimates of the density of the ether varied widely, but ran as high as 1000 tons per cubic millimeter. It is then necessary to explain the interaction between this putative material ether and all other known substances. Since the speed of light changes in different material media, there is clearly a significant interaction, and yet apparently this interaction does not involve any appreciable transfer of ordinary momentum (since otherwise the unhindered motions of the planets are inexplicable).

 

One interesting suggestion was that it might be possible to account for the absence of longitudinal waves by hypothesizing a fluid that possesses vanishingly little resistance to compression, but extremely high rigidity with respect to transverse stresses. In other words, the shear stresses are very large, while the normal stresses vanish. The opposite limit is easy to model with the Navier-Stokes equation by setting the viscosity to zero, which gives an ideal non-viscous fluid with no shear stresses and with the normal stresses equal to the pressure. However, we can't use the ordinary Navier-Stokes equations to represent a substance of high viscosity and zero pressure, because this would imply zero density, and even if we postulate some extremely small (but non-zero) pressure, the normal stresses in the Navier-Stokes equations have components that are proportional to the viscosity, so we still wouldn't be rid of them. We'd have to postulate some kind of adaptively non-isotropic viscosity, and then we wouldn't be dealing with anything that could reasonably be called an ordinary material substance.

 

As noted above, the intense efforts to understand the dynamics of a hypothetical luminiferous ether fluid led directly to modern understanding of fluid dynamics, as modeled by the Navier-Stokes equation for fluids of arbitrary viscosity and compressibility. This equation can be written in vector form as

 

[pic]

 

where p is the pressure, ρ is the density, F the external force vector (per unit mass), ν is the kinematic viscosity, and V is the fluid velocity vector. If the fluid is incompressible then the divergence of the velocity is zero, so the last term vanishes. It’s interesting to consider whether anything can be inferred about the vacuum from this equation. By definition, a vacuum has vanishing density, pressure, and viscosity - at least in the ordinary senses of those terms. Setting these quantities to zero, and in the absence of any external force F, the above equation reduces to dV/dt = −∇p/ρ. Since both p and ρ are taken to be zero, this equation can only be evaluated on the basis of some functional relationship between those two variables. For example, we may assume the ideal gas law, p = ρRT where R is the gas constant and T is temperature. In that case we can evaluate the limit of ∇p/ρ as p and ρ approach zero to give

 

[pic]

 

This rather ghostly proposition apparently describes the disembodied velocity and temperature of a medium possessing neither density nor heat capacity. In a sense it is a medium of pure form and no substance. Of course, this is physically meaningless unless we can establish a correspondence between the terms and some physically observable effects. It was hoped by Stokes, Maxwell, and others that some such identification of terms might enable a limiting case of the Navier-Stokes equation to represent electromagnetic phenomena, but the full delineation of Maxwell's equations for electromagnetism makes it clear that they do not describe the movement of any ordinary material substance, which of course was the basis for the Navier-Stokes equation.
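
For reference, the equation discussed above presumably has the standard compressible form (a sketch in LaTeX; the exact arrangement of the viscous terms in the original figure may differ):

    \frac{\partial V}{\partial t} + (V\cdot\nabla)V
      = F - \frac{\nabla p}{\rho} + \nu\,\nabla^{2}V + \frac{\nu}{3}\,\nabla(\nabla\cdot V)

and, assuming the ideal gas law p = ρRT as above, the limiting term can be written

    \frac{\nabla p}{\rho} = R\,T\,\nabla(\ln\rho) + R\,\nabla T

which is the sense in which temperature and (logarithmic) density gradients survive even as p and ρ themselves go to zero.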

 

Another interesting suggestion was that the luminiferous ether might consist of a substance whose constituent parts, instead of resisting changes in their relative distances (translation), resist changes in orientation. A theory along these lines was proposed by MacCullagh in 1839, and actually led to some of the same formulas as Maxwell's electromagnetic theory. This is an intriguing fact, but it doesn't represent an application (or even an adaptation) of the equations of motion for any ordinary elastic substance, whether gas, fluid, or solid. It's more properly regarded as an abstract mathematical model with only a superficial resemblance to descriptions of the behavior of material substances.

 

Some of the simplest material ether theories were ruled out simply on the basis of first-order optical phenomena, especially stellar aberration. For example, Stokes' theory of complete convection could correctly model aberration (to first order) only with a set of special hypotheses as to the propagation of light, hypotheses that Lorentz later showed to be internally inconsistent. (Stokes erroneously assumed the velocity of a potential flow stream around a sphere is zero at the sphere’s surface.) Fresnel's theory of partial convection was (more or less) adequate, up until it became possible to measure second-order effects, at which point it too was invalidated. But regardless of their empirical failures, none of these theories really adhered to the laws of ordinary fluid mechanics. William Thomson (Lord Kelvin), who was perhaps the most persistent of all in the attempt to represent electromagnetic phenomena in terms of the mechanics of ordinary macroscopic substances, aptly summarized the previous half-century of progress in this line of research at a jubilee in his honor in 1896:

 

One word characterizes the most strenuous efforts for the advancement of science that I have made perseveringly during fifty-five years; that word is failure. I know no more of electric and magnetic force, or of the relation between ether, electricity, and ponderable matter… than I knew… fifty years ago.

 

We might think this assessment was too harsh, especially considering that virtually the entire science of classical electromagnetism - based on Maxwell’s equations - was developed during the period in question. However, in the course of this development Maxwell and his followers had abandoned the effort to find mechanical analogies, and Kelvin equated progress with finding a mechanical analogy. The failure to find any satisfactory mechanical model for electromagnetism led to the abandonment of the principle of qualitative similarity, which is to say, it led to the recognition that the ether must be qualitatively different from ordinary substances. This belief was firmly established once Maxwell showed that longitudinal waves cannot propagate through transparent substances or free space. In so doing, he was finally able to show that all electromagnetic and optical phenomena can be explained by a single system of "stresses in the ether", which, however, he acknowledged must obey quite different laws than do the elastic stresses in ordinary material substances. E. T. Whittaker’s book “Aether and Electricity”, which includes a review of the work of Kelvin and others to find a mechanical model of the ether, concludes that

 

Towards the close of the nineteenth century… it came to be generally recognized that the aether is an immaterial medium, sui generis, not composed of identifiable elements having definite locations in absolute space.

 

Thus by the time of Lorentz it had become clear that the "ether" was simply being arbitrarily assigned whatever formal (and often non-materialistic) properties it needed in order to make it compatible with the underlying electromagnetic laws, and therefore the "corporeal" ether concept was no longer exerting any positive heuristic benefit, but was simply an archaic appendage that was being formalistically superimposed on top of the real physics for no particular reason.

 

Moreover, although the Navier-Stokes equation is as important today for fluid dynamics as Maxwell's equations are for electrodynamics, we've also come to understand that real fluids and solids are not truly continuous media. They actually consist of large numbers of (more or less) discrete particles. As it became clear that the apparently continuous dynamics of fluids and solids were ultimately just approximations based on an aggregate of more primitive electromagnetic interactions, the motivation for trying to explain the latter as an instance of the former came to be seriously questioned. It is rather like saying gold consists of an aggregate of sub-atomic particles, and then going on to say that those sub-atomic particles are made of gold! The effort to explain electromagnetism in terms of a material fluid such as we observe on a macroscopic level, when in fact the electromagnetic interaction is a much more primitive phenomenon, appears today to have been fundamentally misguided, an attempt to model a low-level phenomenon as an instance of a higher level phenomenon.

 

During the last years of the 19th century a careful and detailed examination of electrodynamic phenomena enabled Lorentz, Poincare, and others to develop a theory of the electromagnetic ether that accounted for all known observations, but only by concluding that "the ether is undoubtedly widely different from all ordinary matter". This is because, in order to simultaneously account for aberration, polarization and transverse waves, the complete absence of longitudinal waves, and the failure of the Michelson/Morley experiment to detect any significant ether drift, Lorentz was forced to regard the ether as strictly motionless, and yet subject to non-vanishing stresses, which is contradictory for ordinary matter.

 

Even in Einstein's famous essay on "The Ether and Relativity" he points out that although "we may assume the existence of an ether, we must give up ascribing a definite state of motion to it, i.e. we must take from it the last mechanical characteristic...". He says this because, like Lorentz, he understood that electromagnetic phenomena simply do not conform to the behavior of disturbances in any ordinary material substance - solid, liquid, or gas. Obviously if we wish to postulate some new kind of “substance” whose properties are not constrained to be those of an ordinary substance, we can "back out" whatever properties are needed to match the equations of any field theory (which is essentially what Lorentz did), but this is just an exercise in re-stating the equations in ad hoc verbal terms. Such a program has no heuristic or explanatory content. The question of whether electromagnetic phenomena could be accurately modeled as disturbances in an ordinary material medium was quite meaningful and deserved to be explored, but the answer is unequivocally that the phenomena of electromagnetism do not conform to the principles governing the behavior of ordinary material substances. In fact, we now understand that the latter are governed by the former, i.e., elementary electromagnetic interactions underlie the macroscopic behavior of ordinary material substances.

 

We shouldn't conclude this review of the ether without hearing Maxwell on the subject, since he devoted his entire treatise on electromagnetism to it. Here is what he says in the final article of that immense work:

 

The mathematical expressions for electrodynamic action led, in the mind of Gauss, to the conviction that a theory of the propagation of electric action [as a function of] time would be found to be the very keystone of electrodynamics. Now, we are unable to conceive of propagation in time, except either as the flight of a material substance through space, or as the propagation of a condition of motion or stress in a medium already existing in space...  If something is transmitted from one particle to another at a distance, what is its condition after it has left the one particle and before it has reached the other? ...whenever energy is transmitted from one body to another in time, there must be a medium or substance in which the energy exists after it leaves one body and before it reaches the other, for energy, as Torricelli remarked, 'is a quintessence of so subtle a nature that it cannot be contained in any vessel except the inmost substance of material things'. Hence all these theories lead to the conception of a medium in which the propagation takes place, and if we admit this medium as an hypothesis, I think it ought to occupy a prominent place in our investigations, and that we ought to endeavour to construct a mental representation of all the details of its action, and this has been my constant aim in this treatise.

 

Surely the intuitions of Gauss and Torricelli have been vindicated. Maxwell's dilemma about how the energy of light "exists" during the interval between its emission and absorption was resolved by the modern theory of relativity, according to which the absolute spacetime interval between the emission and absorption of a photon is identically zero, i.e., photons are transmitted along null intervals in spacetime. The quantum phase of events, which we identify as the proper time of those events, does not advance at all along null intervals, so, in a profound sense, the question of a photon's mode of existence "after it leaves one body and before it reaches the other" is moot (as discussed in Section 9). Of course, no one from Torricelli to Maxwell imagined that the propagation of light might depend fundamentally on the existence of null connections between distinct points in space and time. The Minkowskian structure of spacetime is indeed a quintessence of a most subtle nature.

 

3.6  The End of My Latin

 

Leaving the old, both worlds at once they view

That stand upon the threshold of the new.

                                                Edmund Waller, 1686

 

In his book "The Theory of Electrons" (1909) Hendrik Lorentz wrote

 

Einstein simply postulates what we have deduced, with some difficulty and not altogether satisfactorily, from the fundamental equations of the electromagnetic field.

 

This statement implies that Lorentz's approach was more fundamental, and therefore contained more meaningful physics, than the explicitly axiomatic approach of Einstein. However, a close examination of Lorentz's program reveals that he, no less than Einstein, simply postulated relativity.  To understand what Lorentz actually did - and did not - accomplish, it's useful to review the fundamental conceptual issues that he faced.

 

Given any set of equations describing some class of physical phenomena with reference to a particular system of space and time coordinates, it may or may not be the case that the same equations apply equally well if the space and time coordinates of every event are transformed according to a certain rule.  If such a transformation exists, then those equations (and the phenomena they describe) are said to be covariant with respect to that transformation.  Furthermore, if those equations happen to be covariant with respect to a complete class of velocity transformations, then the phenomena are said to be relativistic with respect to those transformations.  For example, Newton's laws of motion are relativistic, because they apply not only with respect to one particular system of coordinates x,t, but with respect to any system of coordinates x',t' related to the former system according to a complete set of velocity transformations of the form

 

[pic]
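
Presumably the transformations referred to as (1) are the familiar Galilean transformations which, suppressing the unaffected transverse coordinates, can be written in LaTeX form as

    x' = x - vt, \qquad t' = t

for every value of the relative velocity v.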

 

From the time of Newton until the beginning of the 19th century many scientists imagined that all of physics might be reducible to Newtonian mechanics, or at least to phenomena that are covariant with respect to the same coordinate transformations as are Newton's laws, and therefore the relativity of Newtonian physics was regarded as complete, in the sense that velocity had no absolute significance, and each one of an infinite set of relatively moving coordinate systems, related by (1), was equally suitable for the description of all physical phenomena.  This is called the principle of relativity, and it's important to recognize that it is just a hypothesis, similar to the principle of energy conservation.  It is the result of a necessarily incomplete induction from our observations of physical phenomena, and it serves as a tremendously useful organizing principle, but only as long as it remains empirically viable.  Admittedly we could regard complete relativity as a direct consequence of the principle of sufficient cause - within a conceptual framework of distinct entities moving in an empty void - but this is still a hypothetical proposition.  The key point to recognize is that although we can easily derive the relativity of Newton's laws under the transformations (1), we cannot derive the correctness of Newton's laws, nor can we derive the complete relativity of physics from the presumptive relativity of the dynamics of material bodies.

 

By the end of the 19th century the phenomena of electromagnetism had become well-enough developed so that the behavior of the electromagnetic field - at least on a macroscopic level - could be described by a set of succinct equations, analogous to Newton's laws of motion for material objects. According to the principle of relativity (in the context of entities in an empty void) it was natural to expect that these new laws would be covariant with respect to the same transformations as the laws of mechanics.  It was therefore somewhat surprising when it turned out that the equations which describe the electromagnetic field are not covariant under the transformations (1).  Apparently the principle of complete relativity was violated.  On the other hand, if mechanics and electromagnetism are really not co-relativistic, it ought to be possible to detect the effects of an absolute velocity, whereas all attempts to detect such a thing failed.  In other words, the principle of complete relativity of velocity continued to survive all empirical tests involving comparisons of the effects of velocity on electromagnetism and mechanics, despite the fact that the (supposed) equations governing these two classes of phenomena were not covariant with respect to the same set of velocity transformations.

At about this time, Lorentz derived the fact that although Maxwell's equations of the electromagnetic field (taking the permittivity and permeability of the vacuum to be invariant) are not covariant with respect to (1), they are covariant with respect to a complete set of velocity transformations, namely, those of the form

 

x' = γ(x − vt),        t' = γ(t − vx)                                                        (2)

 

for a suitable choice of space and time units, where γ = 1/√(1 − v²).  This was a very important realization, because if the equations of the electromagnetic field were not covariant with respect to any complete set of velocity transformations, then the principle of relativity could only have been salvaged by the existence of some underlying medium.  The situation would have been analogous to finding a physical process in which energy is not conserved, leading us to seek some previously undetected mode of energy.  Of course, even recognizing the covariance of Maxwell's equations with respect to (2), the principle of relativity was still apparently violated, because it still appeared that mechanics and electromagnetism were incompatible.
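Verifying the covariance of the full set of Maxwell's equations takes some work, but the essential feature of (2) can be checked in a few lines. The following sketch (only an illustration, in units where the speed of light is 1) confirms that (2) preserves the quantity t² − x², and hence the light cone t = ±x, whereas the Galilean transformation (1) does not:

    from sympy import symbols, sqrt, simplify

    x, t, v = symbols('x t v', real=True)
    gamma = 1/sqrt(1 - v**2)

    xL, tL = gamma*(x - v*t), gamma*(t - v*x)    # coordinates transformed per (2)
    xG, tG = x - v*t, t                          # coordinates transformed per (1)

    print(simplify(tL**2 - xL**2 - (t**2 - x**2)))   # -> 0, so (2) preserves t^2 - x^2
    print(simplify(tG**2 - xG**2 - (t**2 - x**2)))   # -> a non-zero expression, so (1) does not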

 

Recall that Lorentz took Maxwell's equations to be "the fundamental equations of the electromagnetic field" with respect to the inertial rest frame of the luminiferous ether. Needless to say, these equations were not logically derived from more fundamental principles, they were developed by a rational-inductive method whereby observed phenomena were analyzed into a small set of simple patterns, which were then formalized into mathematical expressions.  Even the introduction of the displacement current was just a rational hypothesis.  Admittedly the historical development of Maxwell's equations was guided to some extent by mechanistic analogies, but the mechanical world-view is itself a high-level conceptual framework based on an extensive set of abstract assumptions regarding dimensionality, space, time, plurality, persistent identities, motion, inertia, and various conservation laws and symmetries.  Thus even if a completely successful mechanical model for the electromagnetic field existed, it would still be highly hypothetical. 

 

Moreover, it was already clear by 1905 that Maxwell's equations are not fundamental, since the simple wave model of electromagnetic radiation leads to the ultra-violet catastrophe, and in general cannot account for the micro-structure of radiation, leading to such things as the photo-electric effect and other quantum phenomena.  (Having just completed a paper on the photo-electric effect prior to starting his 1905 paper on special relativity, Einstein was very much aware that Maxwell's equations were not fundamental, and this influenced his choice of foundations on which to base his interpretation of electrodynamics.)  It's worth noting that although Lorentz derived the transformations (2) from the full set of Maxwell's equations (with the permittivity and permeability interpreted as invariants), these transformations actually follow from just one aspect of Maxwell's equations, namely, the invariance of the speed of light.  Thus from the standpoint of logical economy, as well as to avoid any commitment to the fundamental correctness of Maxwell's equations, it is preferable to derive the Lorentz transformation from the minimum set of premises.  Of course, having done this, it is still valuable to show that, as a matter of fact, Maxwell's equations are fully covariant with respect to these transformations.
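To indicate how little is actually needed, the small symbolic sketch below (an illustration only, not Lorentz's or Einstein's own derivation) recovers the transformation (2) from a linear map together with three premises: the spatial origin of the primed system moves at speed v, light rays (speed 1 in these units) map to light rays, and reversing v inverts the transformation:

    from sympy import symbols, Eq, solve, Matrix, simplify

    v, A, B, C, D = symbols('v A B C D', real=True)

    # posit a linear map  x' = A*x + B*t,  t' = C*x + D*t  and impose:
    #   (i)  the worldline x = v*t maps to x' = 0
    #   (ii) the light rays x = +t and x = -t map to x' = +t' and x' = -t'
    sol = solve([Eq(A*v + B, 0),            # (i)
                 Eq(A + B, C + D),          # (ii), applied to the ray x = +t
                 Eq(-A + B, -(-C + D))],    # (ii), applied to the ray x = -t
                [B, C, D], dict=True)[0]

    M = Matrix([[A, sol[B]], [sol[C], sol[D]]])    # = A * [[1, -v], [-v, 1]]

    # reciprocity: the inverse map is the same map with v -> -v (the scale factor A
    # can depend only on |v|); requiring M(v)*M(-v) = identity then fixes A
    print(solve(Eq(simplify((M * M.subs(v, -v))[0, 0]), 1), A))
    # -> roots +/- 1/sqrt(1 - v**2); the positive root is the factor gamma in (2)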

 

To summarize the progress up to this point, Lorentz derived the general transformations (2) relating two systems of space and time coordinates such that if an electromagnetic field satisfies Maxwell's equations with respect to one of the systems, it also satisfies Maxwell's equations with respect to the other.  Now, this in itself certainly does not constitute a derivation of the principle of relativity.  To the contrary, the fact that (2) is different from (1) leads us to expect that the principle of relativity is violated, and that it ought to be possible to detect effects of absolute velocity, or, alternatively, to detect some underlying medium that accounts for the difference between (2) and (1).  Lorentz knew that all attempts to detect an absolute velocity (or underlying medium) had failed, implying that the principle of complete relativity was intact, so something was wrong with the formulations of the laws of electromagnetism and/or the laws of mechanics.

 

Faced with this situation, Lorentz developed his "theorem of corresponding states", which asserts that all physical phenomena transform according to the transformation law for electrodynamics.  This "theorem" is equivalent to the proposition that physics is, after all, completely relativistic.  Since Lorentz presented this as a "theorem", it has sometimes misled people (including, to an extent, Lorentz himself) into thinking that he had actually derived relativity, and that, therefore, his approach was more fundamental or more constructive than Einstein's.  However, an examination of Lorentz's "theorem" reveals that it was explicitly based on assumptions (in addition to the false assumption that Maxwell's equations are the fundamental equations of the electromagnetic field) which, taken together, are tantamount to the assumption of complete relativity.  The key step occurs in §175 of The Theory of Electrons, in which Lorentz writes

 

We are now prepared for a theorem concerning corresponding states of electromagnetic vibration, similar to that of §162, but of a wider scope.  To the assumptions already introduced, I shall add two new ones, namely (1) that the elastic forces which govern the vibratory motions of the electrons are subjected to the relation [300], and (2) that the longitudinal and transverse masses m' and m" of the electrons differ from the mass m0 which they have when at rest in the way indicated by [305].

 

Lorentz's equation [300] is simply the transformation law for electromagnetic forces, and his equations [305] give the relativistic expressions for the transverse and longitudinal masses of a particle.  Lorentz has previously presented these expressions as

 

...the assumptions required for the establishment of the theorem, that the systems S and S0 can be the seat of molecular motions of such a kind that, in both, the effective coordinates of the molecules are the same function of the effective time.

 

In other words, these are the assumptions required in order to make the theorem of corresponding states (i.e., the principle of relativity) true.  Hence Lorentz simply postulates relativity, just as did Galileo and Einstein, and then backs out the conditions that must be satisfied by mechanical objects in order to make relativity true.  Needless to say, if we assume these conditions, we can then easily prove the theorem, but this is tautological, because these conditions were simply defined as those necessary to make the theorem true.  Not surprisingly, if someone just focuses on Lorentz's "proof", without paying attention to the assumptions on which it is based, he might be misled into thinking that Lorentz derived relativity from some more fundamental considerations.  This arises from confusion over what Lorentz was actually doing.  He was primarily deriving the velocity transformations with respect to which Maxwell's equations are covariant, after which he proceeded to determine how the equations of mechanics would need to be modified in order for them to be covariant with respect to these same transformations.  He did not derive the necessity for mechanics to obey these revised laws, any more than Einstein or Newton did.  He simply assumed it, and indeed he had no choice, because the laws of mechanics do not follow from the laws of electromagnetism.  Why, then, does the myth persist (in some circles) that Lorentz somehow derived relativity?

 

To answer this question, we need to examine Lorentz's derivation of the theorem of corresponding states in greater detail.  First, Lorentz justified the contraction of material objects in the direction of motion (with respect to the ether frame) on the basis of his "molecular force hypothesis", which asserts that the forces responsible for maintaining stable configurations of matter transform according to the electromagnetic law.  This can only be regarded as a pure assumption, rather than a conclusion from electromagnetism, for the simple reason that the molecular forces are necessarily not electromagnetic, at least not in the Maxwellian sense.  Maxwell's equations are linear, and it is not possible to construct bound states from any superposition of linear solutions.  Hence Lorentz's molecular force hypothesis cannot legitimately be inferred from electromagnetism.  It is a sheer hypothesis, amounting to the simple assumption that all intrinsic mechanical aspects of material entities are covariant with electromagnetism.

 

Second, and even more importantly, Lorentz justifies the applicability of the "effective coordinates" for the laws of mechanics of material objects by assuming that the inertial masses (both transverse and longitudinal) of material objects transform in the same way as do the "electromagnetic masses" of a charged particle arising from self-reaction.  Admittedly it was once hoped that all inertial mass could be attributed to electromagnetic self-reaction effects, which would have provided some constructive basis for Lorentz's assumption, but we now know that only a very small fraction of the effective mass of an electron is due to the electromagnetic field.  Again, it is simply not possible to account for bound states of matter in terms of Maxwellian electromagnetism, so it does not logically follow that the mechanics of material objects are covariant with respect to (2) simply because the electromagnetic field is covariant with respect to (2).  Of course, we can hypothesize that this is the case, but this is simply the hypothesis of complete physical relativity.

 

Thus Lorentz did not in any way derive the fact that the laws of mechanics are covariant with respect to the same transformations as are the laws of electromagnetism.  He simply observed that if we assume they are (and if we assume every other physical effect, even those presently unknown to us, is likewise covariant), then we get complete physical relativity - but this is tautological.  If all the laws of physics are covariant with respect to a single set of velocity transformations (whether they are of the form (1) or (2) or any other), then by definition physics is completely relativistic.  The doubts about relativity that arose in the 19th century were due to the apparent fact that the laws of mechanics and the laws of electromagnetism were not covariant with respect to the same set of velocity transformations.  Obviously if we simply assume that they are covariant with respect to the same transformations, then the disparity is resolved, but it's important to recognize that this represents just the assumption - not a derivation - of the principle of relativity.

 

An alternative approach to preserving the principle of relativity would be to assume that electromagnetism and mechanics are actually both covariant with respect to the velocity transformations (1).  This would necessitate modifications of Maxwell's equations, and indeed this was the basis for Ritz's emission theory.  However, the modifications that Ritz proposed eventually led to conflict with observation, because according to the relativity based on (1) speeds are strictly additive and there is no finite upper bound on the speed of energy propagation. 

 

The failure of emission theories illustrates the important fact that there are two verifiable aspects of relativistic physics.  The first is the principle of relativity itself, but this principle does not fully determine the observable characteristics of phenomena, because there is more than one possible relativistic pattern, and these patterns are observationally distinguishable.  This is why relativistic physics is founded on two distinct premises, one being the principle of relativity, and the other being some empirical proposition sufficient to identify the particular pattern of relativity (Euclidean, Galilean, Lorentzian) that applies.  Lorentz’s theorem of corresponding states represents the second of these premises, whereas the first is simply assumed, consistent with the apparent relativity of all observable phenomena.  Einstein’s achievement in special relativity was essentially to show that Lorentz’s results (and more) actually follow unavoidably from just a small subset of his assumptions, and that these can be consistently interpreted as primitive aspects of space and time.

 

The first published reference to Einstein's special theory of relativity appeared in a short note by Walter Kaufmann reporting on his experimental results involving the deflection of electrons in an electromagnetic field.  Kaufmann's work was intended as an experimentum crucis for distinguishing between the three leading theories of the electron, those of Abraham, Bucherer, and Lorentz.  In his note of 30 November 1905, Kaufmann wrote

 

In addition there is to be mentioned a recent publication of Mr. A. Einstein on the theory of electrodynamics which leads to results which are formally identical with those of Lorentz's theory.  I anticipate right away the general result of the measurements to be described in the following:  the results are not compatible with the Lorentz-Einstein fundamental assumptions.

 

Kaufmann's results were originally accepted by most physicists as favoring the Abraham theory, but gradually people began to have doubts.  Although the results disagreed with the Lorentz-Einstein model, the agreement with Abraham's theory was not particularly good either.  This troubled Planck, so he conducted a careful review of Kaufmann's experiment and of Kaufmann's analysis of the two competing theories.  It was an interesting example of scientific "detective work" by Planck.

 

Kaufmann in 1905 had measured nine characteristic deflections d1,d2,..,d9 for electrons passing through nine different field strengths.  Then he had computed the nine values that would be predicted by Abraham's theory, and the nine values that would be predicted by Lorentz-Einstein.  However, in order to derive the "predictions" from the theories for his particular experimental setup he needed to include an attenuation factor "k" on the electric field strength.  This factor is actually quite a complicated function of the geometry of the plates and coils used to establish the electric field.  Kaufmann selected a particular value of "k" that he thought would be reasonable.

 

Now, both the Abraham and the Lorentz-Einstein theories predicted that the electron's velocity could never exceed c, but Planck noticed that Kaufmann's choice of k implied a velocity greater than c for at least one of the data points, and was therefore actually inconsistent with both theories.  This caused Planck to suspect that perhaps Kaufmann's assumed value of k was wrong.  Unfortunately the complexity of the experimental setup made it impossible to give a firm determination of the attenuation factor from first principles, but Planck was nevertheless able to extract some useful information from Kaufmann's data.

 

Planck took the nine data points and "backed out" the values of k that would be necessary to make them agree with Abraham's theory.  Then he did the same for the Lorentz-Einstein theory.  All these values of  k  were well within the range of plausibility (given the uncertainty in the experimental setup), so nothing definite could be concluded, but Planck noted that the nine k-values necessary to match the Lorentz-Einstein theory to the measurements were all nearly equal, whereas the nine k-values necessary to match Abraham showed more variation.  From this, one might actually infer a slight tilt in favor of the Lorentz-Einstein theory, simply by virtue of the greater consistency of k values.

 

Naturally this inconclusive state of affairs led people to try to think of an experiment that would be more definitive.  In 1908 Bucherer performed a variation of Kaufmann's experiment, but with an experimental setup taking Planck's analysis into account, so that uncertainty in the value of k basically "cancels out".  Bucherer's results showed clear agreement with the Lorentz-Einstein theory and disagreed with the Abraham theory.  Additional and more refined experiments were subsequently performed, and by 1916 it was clear that the experimental evidence did in fact support what Kaufmann had called "the Lorentz-Einstein fundamental assumptions".

 

Incidentally, it's fascinating to compare the reactions of Lorentz, Poincare, and Einstein to Kaufmann's results.  Lorentz was ready to abandon his entire model (and life's work) since it evidently conflicted with this one experiment.  As he wrote to Poincare in 1906, the length contraction hypothesis was crucial for the coherence of his entire theoretical framework, and yet

 

Unfortunately my hypothesis of the flattening of electrons is in contradiction with Kaufmann's results, and I must abandon it.  I am, therefore, at the end of my Latin.

 

Poincare agreed that, in view of Kaufmann's results "the entire theory may well be threatened".  It wasn't until the announcement of Bucherer's results that Lorentz regained confidence in his own theoretical model.  Interestingly, he later cited those results as one of the main reasons for his eventual acquiescence with the relativity principle, noting that if Lorentz-covariance is actually as comprehensive as these experimental results show it to be, then the ether concept is entirely devoid of heuristic content.  (On the other hand, he did continue to maintain that there were some benefits in viewing things from the standpoint of absolute space and time, even if we are not at present able to discern such things.)

 

Einstein's reaction to Kaufmann's apparently devastating results was quite different.  In a review article on relativity theory in 1907, Einstein acknowledged that his theory was in conflict with Kaufmann's experimental results, and he could find nothing wrong with either Kaufmann's experiment or his analysis, which seemed to indicate in favor of Abraham's theory over relativity.  Nevertheless, the young patent examiner continued

 

It will be possible to decide whether the foundations of the relativity theory correspond with the facts only if a great variety of observations is at hand...  In my opinion, both [the alternative theories of Abraham and Bucherer] have rather slight probability, because their fundamental assumptions concerning the mass of moving electrons are not explainable in terms of theoretical systems which embrace a greater complex of phenomena.  A theory is the more impressive the greater the simplicity of its premises, the more different kinds of things it relates, and the more extended is its area of applicability.

 

This is a remarkable defense of a scientific theory against apparent experimental falsification.  While not directly challenging the conflict between experiment and theory, Einstein nevertheless maintained that we should regard relativity as most likely correct, essentially on the basis of its scope and conceptual simplicity.  Oddly enough, when later confronted with similar attempts to justify other people's theories, Einstein was fond of saying that "a theory should be as simple as the facts allow - but no simpler".  Yet here we find him serenely confident that the "facts", rather than his theory, would ultimately be overturned, which turned out to be the case.  This sublime confidence in the correctness of certain fundamental ideas was a characteristic of Einstein throughout his career.  When asked what he would have done if the eclipse observations had disagreed with the prediction of general relativity for the bending of light, Einstein replied "Then I would have felt sorry for the dear Lord, because the theory is correct."

3.7  Zeno and the Paradox of Motion

 

We may say a thing is at rest when it has not changed its position between now and then, but there is no ‘then’ in ‘now’, so there is no being at rest. Both motion and rest, then, must necessarily occupy time.

                                                                                                                Aristotle, 350 BC

 

The Eleatic school of philosophers was founded by the religious thinker and poet Xenophanes (born c. 570 BC), whose main teaching was that the universe is singular, eternal, and unchanging. "The all is one." According to this view, as developed by later members of the Eleatic school, the appearances of multiplicity, change, and motion are mere illusions. Interestingly, the colony of Elea was founded by a group of Ionian Greeks who, in 545 BC, had been besieged in their seaport city of Phocaea by an invading Persian army, and were ultimately forced to evacuate by sea. They sailed to the island of Corsica, and occupied it after a terrible sea battle with the navies of Carthage and the Etruscans. Just ten years later, in 535 BC, the Carthaginians and Etruscans regained the island, driving the Phocaean refugees once again into the sea. This time they landed on the southwestern coast of Italy and founded the colony of Elea, seizing the site from the native Oenotrians. All this happened within the lifetime of Xenophanes, himself a wandering exile from his native city of Colophon in Ionia, from which he too had been forced to flee in 545 BC. He lived in Sicily and then in Catana before finally joining the colony at Elea.  It's tempting to speculate on how these events may have psychologically influenced the Eleatic school's belief in permanent unalterable oneness, denying the reality of change and plurality in the universe.

 

The greatest of the Eleatic philosophers was Parmenides (born c. 539 BC). In addition to developing the theme of unchanging oneness, he is also credited with originating the use of logical argument in philosophy. His habit was to accompany each statement of belief with some kind of logical argument for why it must be so. It's possible that this was a conscious innovation, but it seems more likely that the habitual rationalization was simply a peculiar aspect of his intellect. In any case, on this basis he is regarded as the father of metaphysics, and, as such, a key contributor to the evolution of scientific thought.

 

Parmenides's belief in the absolute unity and constancy of reality is quite radical and abstract, even by modern standards. He maintained that the universe is literally singular and unchangeable. However, his rationalism forced him to acknowledge that appearances are to the contrary, i.e., while he flatly denied the existence of plurality and change, he admitted the appearance of these things. Nevertheless, he insisted these were mere perceptions and opinions, not to be confused with "what is". Not surprisingly, Parmenides was ridiculed for his beliefs. One of Parmenides' students was Zeno, who is best remembered for a series of arguments in which he defends the intelligibility of the Eleatic philosophy by purporting to prove, by logical means, that change (motion) and plurality are impossible.

 

We can't be sure how the historical Zeno intended his arguments to be taken, since none of his writings have survived. We know his ideas only indirectly through the writings of Plato, Aristotle, Simplicius, and Proclus, none of whom was exactly sympathetic to Zeno's philosophical outlook. Furthermore, we're told that Zeno's arguments were a "youthful effort", and that they were made public without his prior knowledge or consent. Also, even if we accept that his purpose was to defend the Eleatic philosophy against charges of logical inconsistency, it doesn't follow that Zeno necessarily regarded his counter-charges as convincing. It's conceivable that he intended them as satires of (what he viewed as) the fallacious arguments that had been made against Parmenides' ideas. In any case, although we cannot know for sure how Zeno himself viewed his "paradoxes", we can nevertheless examine the arguments themselves, as they've come down to us, to see if they contain - or suggest - anything of interest.

 

Of the 40 arguments attributed to Zeno by later writers, the four most famous are on the subject of motion:

 

The Dichotomy: There is no motion, because that which is moved must arrive at the middle before it arrives at the end, and so on ad infinitum.

The Achilles: The slower will never be overtaken by the quicker, for that which is pursuing must first reach the point from which that which is fleeing started, so that the slower must always be some distance ahead.

The Arrow: If everything is either at rest or moving when it occupies a space equal to itself, while the object moved is always in the instant, a moving arrow is unmoved.

The Stadium: Consider two rows of bodies, each composed of an equal number of bodies of equal size. They pass each other as they travel with equal velocity in opposite directions. Thus, half a time is equal to the whole time.

 

The first two arguments are usually interpreted as critiques of the idea of continuous motion in infinitely divisible space and time. They differ only in that the first is expressed in terms of absolute motion, whereas the second shows that the same argument applies to relative motion. Regarding these first two arguments, there's a tradition among some high school calculus teachers to present them as "Zeno's Paradox", and then "resolve the paradox" by pointing out that an infinite series can have a finite sum. This may be a useful pedagogical device for beginning calculus students, but it misses an interesting and important philosophical point implied by Zeno's arguments. To see this, we can re-formulate the essence of these two arguments in more modern terms, and show that, far from being vitiated by the convergence of infinite series, they actually depend on the convergence of the geometric series.
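For what it's worth, the teachers' point is easily made explicit. Assuming (purely for illustration) a runner crossing a unit distance at unit speed, the times needed for the successive halves form the geometric series 1/2 + 1/4 + 1/8 + ..., whose partial sums approach the finite total time 1:

    # partial sums of the Dichotomy series 1/2 + 1/4 + 1/8 + ... for a runner
    # covering a unit distance at unit speed (illustrative numbers only)
    total = 0.0
    for n in range(1, 31):
        total += 0.5**n
    print(total)      # -> 0.9999999990..., approaching the finite total time of 1

The series converges, so the infinitude of stages does not by itself imply an infinite time; the interesting question, as the mirror example below shows, lies elsewhere.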

 

Consider a ray of light bouncing between an infinite sequence of mirrors as illustrated below

[pic]

On the assumption that matter, space, and time are continuous and infinitely divisible (scale invariant), we can conceive of a point-like massless particle (say, a photon) traveling at constant speed through a sequence of mirrors whose sizes and separations decrease geometrically (e.g., by a factor of two) on each step. The envelope around these mirrors is clearly a wedge shape that converges to a point, and the total length of the zigzag path is obviously finite (because the geometric series 1 + 1/2 + 1/4 + ... converges), so the particle must reach "the end" in finite time. The essence of Zeno's position against continuity and infinite divisibility is that there is no logical way for the photon to emerge from the sequence of mirrors. The direction in which the photon would be traveling when it emerged would depend on the last mirror it hit, but there is no "last" mirror. Similarly we could construct "Zeno's maze" by having a beam of light directed around a spiral as shown below:

 

[pic]

 

Again the total path is finite, but has no end, i.e., no final direction, and a ray propagating along this path can neither continue nor escape. Of course, modern readers may feel entitled to disregard this line of reasoning, knowing that matter consists of atoms which are not infinitely divisible, so we could never construct an infinite sequence of geometrically decreasing mirrors. Also, every photon has some finite scattering wavelength and thus cannot be treated as a "point particle". Furthermore, even a massless particle such as a photon necessarily has momentum according to the quantum and relativistic relation p = h/λ, and the number of rebounds per unit time – and hence the outward pressure on the structure holding the mirrors in place - increases to infinity as the photon approaches the convergent point. However, these arguments merely confirm Zeno's position that the physical world is not scale-invariant or infinitely divisible (noting that Planck’s constant h represents an absolute scale). Thus, we haven't debunked Zeno, we've merely conceded his point. Of course, this point is not, in itself, paradoxical. It simply indicates that at some level the physical world must be regarded as consisting of finite indivisible entities. We arrive at Zeno's paradox only when these arguments against infinite divisibility are combined with the complementary set of arguments (The Arrow and The Stadium) which show that a world consisting of finite indivisible entities is also logically impossible, thereby presenting us with the conclusion that physical reality can be neither continuous nor discontinuous.

 

The more famous of Zeno's two arguments against discontinuity is "The Arrow", which focuses on the instantaneous physical properties of a moving arrow. He notes that if physical objects exist discretely at a sequence of discrete instants of time, and if no motion occurs in an instant, then we must conclude that there is no motion in any given instant. (As Bertrand Russell commented, this is simply "a plain statement of an elementary fact".) But if there is literally no physical difference between a moving and a non-moving arrow in any given discrete instant, then how does the arrow know from one instant to the next if it is moving? In other words, how is causality transmitted forward in time through a sequence of instants, in each of which motion does not exist?

 

It's been noted that Zeno's "Arrow" argument could also be made in the context of continuous motion, where in any single slice of time there is (presumed to be) no physical difference between a moving and a non-moving arrow. Thus, Zeno suggests that if all time is composed of instants (continuous or discrete), and motion cannot exist in any instant, then motion cannot exist at all. A naive response to this argument is to point out that although the value of a function f(t) is a single fixed number for any given t, the derivative of f(t) may nevertheless be non-zero at that same t. But, again, this explanation doesn't really address the phenomenological issue raised by Zeno's argument. A continuous function (as emphasized by Weierstrass) is a static completed entity, so by invoking this model we are essentially agreeing with Parmenides that physical motion does not truly exist, and is just an illusion, i.e., "opinions", arising from our psychological experience of a static unchanging reality.
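To spell out that naive response in symbols: the velocity at an instant t0 is defined by a limit taken over neighboring instants, so it attaches to the trajectory as a whole rather than to the single value x(t0). A small illustrative sketch (again using sympy, with a uniformly moving "arrow" assumed purely for concreteness):

    from sympy import symbols, limit

    t, t0, h, v = symbols('t t0 h v', real=True)
    x = v*t        # a uniformly moving "arrow" (illustrative trajectory)

    # x(t0) is a single number, but the velocity at t0 is a limit over the
    # neighboring instants t0 + h, i.e. a property of the function, not of x(t0)
    print(limit((x.subs(t, t0 + h) - x.subs(t, t0)) / h, h, 0))    # -> v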

 

Of course, to accomplish this we have expanded our concept of "the existing world" to include another dimension. If, instead, we insist on adhering to the view of the entire physical world as a purely spatial expanse, existing in and progressing through a sequence of instants, then we again run into the problem of how a quality that exists only over a range of instants can be causally conveyed through any given instant in which it has no form of existence. Before blithely dismissing this concern as nonsensical, it's worth noting that modern physics has concluded (along with Zeno) that the classical image of space and time was fundamentally wrong, and in fact motion would not be possible in a universe constructed according to the classical model. We now recognize that position and momentum are incompatible variables, in the sense that an exact determination of either one of them leaves the other completely undetermined. According to quantum mechanics, the operators for spatial position and momentum have no common eigenstates, so, just as Zeno's arguments suggest, it really is inconceivable for an object to have a definite position and momentum (motion) simultaneously.

 

The theory of special relativity answers Zeno's concern over the lack of an instantaneous difference between a moving and a non-moving arrow by positing a fundamental re-structuring of the basic way in which space and time fit together, such that there really is an instantaneous difference between a moving and a non-moving object, insofar as it makes sense to speak of "an instant" of a physical system with mutually moving elements. Objects in relative motion have different planes of simultaneity, with all the familiar relativistic consequences, so not only does a moving object look different to the world, but the world looks different to a moving object.

 

This resolution of the paradox of motion presumably never occurred to Zeno, but it's no exaggeration to say that special relativity vindicates Zeno's skepticism and physical intuition about the nature of motion. He was correct that instantaneous velocity in the context of absolute space and absolute time does not correspond to physical reality, and probably doesn't even make sense. From Zeno's point of view, the classical concept of absolute time was not logically sound, and special relativity (or something like it) is a logical necessity, not just an empirical fact. It's even been suggested that if people had taken Zeno's paradoxes more seriously they might have arrived at something like special relativity centuries ago, just on logical grounds. This suggestion goes back at least to Minkowski's famous lecture of "staircase wit" (see Section 1.7). Doubtless it's stretching the point to say that Zeno anticipated the theory of special relativity, but it's undeniably true that his misgivings about the logical consistency of motion in its classical form were substantially justified. The universe does not (and arguably, could not) work the way people thought it did.

 

In all four of Zeno's arguments on motion, the implicit point is that if space and time are independent, then logical inconsistencies arise regardless of whether the physical world is continuous or discrete. All of those inconsistencies can be traced to the implication that, if any motion is possible, then the range of conceivable relative velocities must be unbounded, corresponding to Minkowski's "unintelligible" G∞.

 

What is the alternative? Zeno considers the premise that the range of possible relative velocities is bounded, i.e., there is some maximum achievable (conceivable) relative velocity, and he associates this possibility with the idea that space and time are not infinitely divisible. (It presumably didn't occur to him that another way of achieving this is to assume space and time are not independent.)

 

This brings us to the last of Zeno's four main arguments on motion, "The Stadium", which has always been the most controversial, partly because the literal translation of its statement is somewhat uncertain. In this argument Zeno appears to be attacking the only remaining alternative to the unintelligible G∞, namely, the possibility of a finite upper bound on conceivable velocity. It's fascinating that he argues in much the same way that modern students do when they're first introduced to the concept of an invariant speed in the theory of special relativity. He says, in effect, that if someone is running towards me from the west at the maximum possible speed, and someone else is approaching me from the east at the maximum possible speed, then they are approaching each other at twice the maximum possible speed...which is a contradiction.
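From the modern standpoint this objection is answered by the non-additive composition of velocities: in units where the maximum (invariant) speed is 1, the speed of either runner as reckoned by the other is not u + v but (u + v)/(1 + uv), which never exceeds 1. A few illustrative lines:

    # relativistic composition of speeds, in units where the invariant speed is 1
    def compose(u, v):
        return (u + v) / (1 + u*v)

    print(compose(0.9, 0.9))    # -> 0.99447..., still below the maximum speed
    print(compose(1.0, 1.0))    # -> 1.0, not 2.0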

 

To illustrate the relevance of Zeno's arguments to a discussion of the consequences of special relativity, compare the discussion of time dilation in Section 2.13 of Rindler's "Essential Relativity" with Heath's review of Zeno's Stade paradox in Chapter VIII of "A History of Greek Mathematics". The resemblance is so striking that it's tempting to imagine that either Rindler consciously patterned his discussion on some recollection of Zeno's argument, or it's an example of Jung's collective unconscious. Here is a reproduction of Rindler's Figure 2.4, showing three "snapshots" of two sequences of clocks A, B, C,... and A', B', C', ... fixed at certain equal intervals along the x axes of two frames S and S':

 

[pic]

 

These three snapshots are taken at equal intervals by an observer in a third frame S", relative to which S and S' have equal and opposite velocities. Rindler describes the values that must appear on each clock in order to explain the seemingly paradoxical result that each observer considers the clocks of the others to be running slow, in accord with Einsteinian relativity. Compare this with the figure on page 277 of Heath:

 

[pic]

 

where again we have three snapshots of a sequence of clocks (i.e., observers/athletes), this time showing the reference frame S" as well as the two frames S and S' that are moving with equal and opposite velocities relative to S". As Aristotle commented, this scenario evidently led Zeno to the paradoxical conclusion that "half the time is equal to its double", precisely as the freshman physics student suspects when he first considers the implications of relativity.

 

Surely we can forgive Zeno for not seeing that his arguments can only be satisfactorily answered - from the standpoint of physics - by assuming Lorentzian invariance and the relativity of space and time. According to this view, with its rejection of absolute simultaneity, we're inevitably led from a dynamical model in which a single slice of space progresses "evenly and equably" through time, to a purely static representation in which the entire history of each worldline already exists as a completed entity in the plenum of spacetime. This static representation, according to which our perceptions of change and motion are simply the product of our advancing awareness, is strikingly harmonious with the teachings of Parmenides, whose intelligibility Zeno's arguments were designed to defend.

 

Have we now finally resolved Zeno's "youthful effort"? Given the history of "final resolutions", from Aristotle onwards, it's probably foolhardy to think we've reached the end. It may be that Zeno's arguments on motion, because of their simplicity and universality, will always serve as a kind of "Rorschach image" onto which people can project their most fundamental phenomenological concerns (if they have any).

 

3.8  A Very Beautiful Day

 

Such a solemn air of silence has descended between us that I almost feel as if I am committing a sacrilege when I break it now with some inconsequential babble. But is this not always the fate of the exalted ones of this world?

                                                                                                Einstein to Habicht, 25 May 1905

 

In 1894 Einstein's parents and younger sister Maja moved to Italy, where his father hoped to start a new business. It was arranged for Albert, then 15, to remain in Munich to complete his studies at the Gymnasium (high school), but the young lad soon either dropped out or was invited to leave (recollections differ). He then crossed the Alps to reunite with his family in Italy. Lacking a high school diploma, his options for further education were limited, but his father still hoped for him to become an electrical engineer, which required a university degree. It so happens that the Zurich Polytechnic Institute had an unusual admissions policy which did not require a high school diploma, provided the applicant could pass the entrance examination, so after a year off in Italy, the 16 year old Albert was dispatched to Zurich to take the exam. He failed, having made (as he later admitted) "no attempt whatsoever to prepare myself". In fairness, it should be noted that the usual age for taking the exam was 18, but it seems he wasn't particularly eager to (as his father advised) "forget his philosophical nonsense and apply himself to a sensible trade like electrical engineering".

 

Fortunately, the principal of the Polytechnic noted the young applicant's unusual strength in mathematics, and helped make arrangements for Einstein to attend a cantonal school in the picturesque town of Aarau, twenty miles west of Zurich. The headmaster of the school was Professor Jost Winteler, an ornithologist. During his time in Aarau Einstein stayed with the Winteler family, and always had fond memories of the time he spent there, in contrast with what he regarded as the coercive atmosphere at the Munich Gymnasium. He became romantically involved with Marie Winteler (Jost's daughter), but seems to have been less serious about it than she was, and the relationship ended badly when Einstein took up with Mileva Maric. He also formed life-long relationships with two of the other Winteler children, Paul and Anna. Paul Winteler married Einstein's sister Maja, and Anna Winteler married Michelangelo Besso, one of Einstein's closest friends. Besso, six years older than Einstein, was a Swiss-Italian studying to be an electrical engineer. Like Einstein, he played the violin, and the two of them first met at a musical gathering in 1896.

 

It was just a year earlier that the 16-year-old Einstein had first wondered how the world would appear to someone traveling at the speed of light. He realized that to such an observer a co-moving lightwave in a vacuum would appear as a spatially fluctuating standing wave, i.e., a stationary wave of light, but it doesn't take an expert in Maxwell's equations to be skeptical that any such configuration is possible. Indeed, Einstein later recalled that "from the beginning it appeared to me intuitively clear" that light must propagate in the same way with respect to any system of inertial coordinates. However, this invariance directly contradicts the Galilean addition rule for the composition of velocities. This problem stayed with Einstein for the next ten years, during which time he finally gained entrance to the Polytechnic, and, to the disappointment of his family, switched majors from electrical engineering to physics. His friend Besso continued with his studies and became an electrical engineer in Milan. Already by this time Einstein had turned from engineering to pure physics, and seems to have decided (or foreseen) how he would spend his life, as he wrote in an apologetic letter to Marie's mother Pauline Winteler in the Spring of 1897

 

Strenuous intellectual work and looking at God’s Nature are the reconciling, fortifying, yet relentlessly strict angels that shall lead me through all of life’s troubles… And yet what a peculiar way this is to weather the storms of life – in many a lucid moment I appear to myself as an ostrich who buries his head in the desert sand so as not to perceive the danger. One creates a small little world for oneself, and as lamentably insignificant as it may be in comparison with the perpetually changing size of real existence, one feels miraculously great and important…

 

Despite his love of physics, Einstein did not perform very impressively as an undergraduate in an academic setting, and this continued to be true in graduate school.  Hermann Minkowski referred to his one-time pupil as a "lazy dog". As the biographer Clark wrote, "Einstein became, as far as the professorial staff of the ETH was concerned, one of the awkward scholars who might or might not graduate but who in either case was a great deal of trouble". Professor Pernet at one point suggested to Einstein that he switch to medicine or law rather than physics, saying "You can do what you like, I only wish to warn you in your own interest". Clearly Einstein "pushed along with his formal work just as much as he had to, and found his real education elsewhere". Often he didn't even attend the lectures, relying on Marcel Grossman's notes to cram for exams, making no secret of the fact that he wasn't interested in what men like Weber had to teach him. His main focus during the four years he was enrolled at the ETH was independently studying the works of Kirchhoff, Helmholtz, Hertz, Maxwell, Poincare, etc., flagrantly outside the course of study prescribed by the ETH faculty. Some idea of where his studies were leading him can be gathered from a letter to his fellow student and future wife Mileva Maric written in August of 1899

 

I returned to the Helmholtz volume and am at present studying again in depth Hertz’s propagation of electric force. The reason for it was that I didn’t understand Helmholtz’s treatise on the principle of least action in electrodynamics. I am more and more convinced that the electrodynamics of moving bodies, as presented today, is not correct, and that it should be possible to present it in a simpler way. The introduction of the term “ether” into the theories of electricity led to the notion of a medium of whose motion one can speak without being able, I believe, to associate a physical meaning with this statement. I think that the electric forces can be directly defined only for empty space…

 

Einstein later recalled that after graduating in 1900 the "coercion" of being forced to take the final exams "had such a detrimental effect that... I found the consideration of any scientific problem distasteful to me for an entire year". He achieved an overall mark of 4.91 out of 6, which is rather marginal. Academic positions were found for all members of the graduating class in the physics department of the ETH with the exception of Einstein, who seems to have been written off as virtually unemployable, "a pariah, discounted and little loved", as he later said.

 

From Milan in late August of 1900 Einstein wrote to his girlfriend, Mileva, and mentioned that

 

I am spending many evenings here at Michele's. I like him very much because of his sharp mind and his simplicity, and also Anna and, especially, the little brat. His house is simple and cozy, even though the details show some lack of taste…

 

In another letter to Mileva, in October, he commented that his friend had intuited the blossoming romance between Einstein and Mileva (who had studied physics together at the Polytechnic)

 

Michele has already noticed that I like you, because, even though I didn't tell him almost anything about you, he said, when I told him that I must now go to Zurich again: "He surely wants to go to his [woman] colleague, what else would draw him to Zurich?" I replied "But unfortunately she is not there yet". I prodded him very much to become a professor, but I doubt very much that he'll do it. He simply doesn't want to let himself and his family be supported by his father. This is after all quite natural. What a waste of his truly outstanding intelligence.

 

Despite Einstein’s consistently high appraisal of Besso’s intelligence, there was another aspect to Besso’s personality – a certain addle-brained quality – that seemed to amuse Einstein, but that others sometimes found alarming. In March of 1901 Einstein recounted a story about his friend Michele in a letter to Mileva:

 

On the evening of the day before yesterday, Michele's director, with whom we are rather well acquainted, was at our house for music making. He said how totally unusable and almost mentally incompetent [not responsible for his actions in a legal sense] Michele is, despite his extraordinarily extensive knowledge. Most delectable is the following little story…Once again, Michele had nothing to do, so his principal sends him to the Casale power Station to inspect and check the newly installed lines.  Our hero decides to leave in the evening, to save valuable time, of course, but unfortunately he missed the train. The next day he remembered the commission too late. On the third day he went to the train on time, but realized, to his horror, that he no longer knew what he had been requested to do; so he immediately wrote a postcard to the Office, asking that they should wire him what he was supposed to do!!  I think the man is not normal.

 

In another “love letter” to Mileva in April, Einstein wrote about having just read Planck’s paper on radiation “with mixed feelings”, because “misgivings of a fundamental nature have arisen in my mind”. In the same letter he wrote

 

Michele arrived with wife and child from Trieste the day before yesterday.  He is an awful weakling without a spark of healthy humaneness, who cannot rouse himself to any action in life or scientific creation, but an extraordinarily fine mind, whose working, though disorderly, I watch with great delight.  Yesterday evening I talked shop with him with great interest for almost 4 hours.  We talked about the fundamental separation of luminiferous ether and matter, the definition of absolute rest, molecular forces, surface phenomena, dissociation.  He is very interested in our investigations, even though he often misses the overall picture because of petty considerations.  This is inherent in the petty disposition of his being, which constantly torments him with all kinds of nervous notions.

 

Toward the end of 1901 Einstein had still found no permanent position.  As he wrote to Grossman in December of that year, "I am sure I would have found a position [by now] were it not for Weber's intrigues against me". It was only because Grossman's father happened to be good friends with Haller, the chief of the Swiss Patent Office, that Einstein was finally given a job, despite the fact that Haller judged him to be "lacking in technical training". Einstein wrote gratefully to the Grossmans that he "was deeply moved by your devotion and compassion which do not let you forget an old, unlucky friend", and that he would spare no effort to live up to their recommendation.  He had applied for Technical Expert 2nd class, but was given the rank of 3rd class (in June 1902).

 

As soon as he'd been away from the coercive environment of academia long enough that he could stand once again to think about science, he resumed his self-directed studies, which he pursued during whatever free time a slightly lazy patent examiner can make for himself. His circumstances were fairly unusual for someone working on a doctorate, especially since he'd already been rejected for academic positions by both the ETH and the University of Zurich. He was undeniably regarded by the academic community (and others) as "an awkward, slightly lazy, and certainly intractable young man who thought he knew more than his elders and betters". 

 

In early 1905, while employed as a patent examiner in Bern, Einstein was striving to complete his doctoral thesis, focusing on black-body radiation, and at the same time writing a paper on light-quanta (later cited by the Nobel committee) and another on Brownian motion, each of which was a significant contribution to 20th century physics. After completing these papers he turned his attention once again to the "philosophical nonsense" of the velocity addition problem, which he realized was "a puzzle not easy to solve at all", not least because his ideas about light quanta had made it clear to him that Maxwell’s equations could not claim absolute validity, so there were no clear foundations on which to build.

 

After completing the statistical and light quanta papers on March 17, April 30, and May 10, 1905, he allowed himself to concentrate fully on the problem of motion, which apparently had never been far from his mind. As he later recalled, he "felt a great difficulty to resolve the question... I had wasted almost a year in fruitless considerations..." Then came the great turning point, both for Einstein's own personal life and for modern physics: "Unexpectedly, a friend of mine in Bern then helped me." The friend was Michelangelo Besso, who had by then also taken a job at the Swiss patent office. In his Kyoto lecture of 1922 Einstein later remembered the circumstances of the unexpected help he received from Besso:

 

That was a very beautiful day when I visited him and began to talk with him as follows:  "I have recently had a question which was difficult for me to understand.  So I came here today to bring with me a battle on the question."  Trying a lot of discussions with him, I could suddenly comprehend the matter.  Next day I visited him again and said to him without greeting "Thank you.  I've completely solved the problem." 

 

It had suddenly become clear to Einstein during his discussion with Besso that the correlation of time at different spatial locations is not absolutely defined, since it depends fundamentally on some form of communication between those locations. Thus, the concept of simultaneity at separate locations is relative.  A mere five weeks after this recognition, Einstein completed "On the Electrodynamics of Moving Bodies", in which he presented the special theory of relativity. This monumental paper contains not a single reference to the literature, and only one acknowledgement:

 

In conclusion, I wish to say that in working at the problem here dealt with I have had the loyal assistance of my friend and colleague M. Besso, and that I am indebted to him for several valuable suggestions.

 

We don't know precisely what those suggestions were, but we have Einstein's later statement that he "could not have found a better sounding board for his ideas in all of Europe." It was also Besso who introduced Einstein to the writings of Ernst Mach, which were to have such a profound influence on the development of the general theory (although subsequently Einstein emphasized the influence of Hume over Mach). Besso self-deprecatingly described their intellectual relationship by saying "Einstein the eagle took Besso the sparrow under his wing, and the sparrow flew a little higher". The two men carried on a regular correspondence that lasted over half a century, through two world wars, and Einstein's incredible rise to world fame. It's interesting that, despite how highly Einstein valued Besso's intellect, the latter invariably took a self-denigrating tone in their correspondence (and presumably in their conversations), sometimes even seeming to be genuinely puzzled by the significance that Einstein attached to his "little" comments. In a letter of August 1918 Besso wrote

 

You had, by the way, overestimated the meaningfulness of my observations again: I was not aware that they had the meaning that an energy tensor for gravitation was dispensable. If I understand it correctly, my inadvertent statement now implies that planetary motion would satisfy conservation laws just by chance, as it were. What is certain is that I was not aware of this consequence of my comments and cannot grasp the argument even now.

 

The friendship with Besso may have been, in some ways, the most meaningful of Einstein's life. Michele and his wife sometimes took care of Einstein's children, tried to reconcile Einstein with Mileva when their marriage was foundering, and so on. Another of the few close personal ties that Einstein was able to maintain over the years was with Max von Laue, who Einstein believed was the only one of the Berlin physicists who behaved decently during the Nazi era. Following the war, a friend of Einstein's was preparing to visit Germany and asked if Einstein would like him to convey any messages to his old friends and colleagues. After a moment of thought, Einstein said "Greet Laue for me". The friend, trying to be helpful, then asked specifically about several other individuals among Einstein's former associates in his homeland. Einstein thought for another moment, and said "Greet Laue for me".

 

The stubborn, aloof, and uncooperative aspect of Einstein's personality that he had shown as a student continued to some extent throughout his life. For example, in 1937 he collaborated with Nathan Rosen on a paper purporting to show, contrary to his own prediction of 1916, that gravitational waves cannot exist - at least not without unphysical singularities. He submitted this paper to Physical Review, and it was returned to him with a lengthy and somewhat critical referee report asking for clarifications. Apparently Einstein was unfamiliar with the refereeing of papers, routinely practiced by American academic journals. He wrote back to the editor

 

Dear Sir,

We (Mr. Rosen and I) had sent you our manuscript for publication and had not authorized you to show it to specialists before it is printed. I see no reason to address the - in any case erroneous - comments of your anonymous expert. On the basis of this incident I prefer to publish the paper elsewhere.

respectfully,

P.S. Mr. Rosen, who has left for the Soviet Union, has authorized me to represent him in this matter.

 

Was the postscript about Mr. Rosen's departure to the Soviet Union (in the politically charged atmosphere of the late 1930's) an oblique jibe at American mores, or just a bland informational statement?  In any case, Einstein submitted the paper, unaltered, to another journal (The Journal of the Franklin Institute). However, before it appeared he came to realize that its argument was faulty, and he re-wrote the paper and its conclusions. Interestingly, what Einstein had realized is precisely what the anonymous referee had pointed out, namely, that by a change of coordinates the construction given by Einstein and Rosen is simply a description of cylindrical waves, with a singularity only along the axis (considered to be an acceptable singularity). The referee report still exists among Einstein's private papers, although it isn't clear whether it was what prompted the correction. (The correction may also have been prompted by private comments from Howard Percy Robertson, relayed via Infeld, after Robertson returned to Princeton from sabbatical. On the other hand, these two possibilities may amount to the same thing, since Kennefick speculates that Robertson was the anonymous referee!)

 

Another aspect of Einstein's personality that seems incongruous with scholarly success was his remarkable willingness to make mistakes in public and change his mind about things, with seemingly no concern for the effect this might have on his academic credibility.  Regarding the long succession of "unified field theories" that Einstein produced in the 1920's and 30's, Pauli commented wryly "It is psychologically interesting that for some time the current theory is usually considered by its author to be the 'definitive solution'". Eventually Einstein gave up on the particular approach to unification that he had been pursuing in those theories, and cheerfully wrote to Pauli "You were right after all, you rascal". Lest we think that this willingness to make and admit mistakes was a characteristic only of the aged Einstein, past his prime, recall Einstein's wry self-description in a letter to Ehrenfest in December 1915: "That fellow Einstein suits his convenience. Every year he retracts what he wrote the year before."

 

In 1939 Einstein's sister, Maja Winteler, was forced by Mussolini's racial policies to leave Florence. She went to Princeton to join her brother, while her husband Paul moved in with his sister Anna and Michele Besso's family in Geneva.  Ironically, twenty years earlier Einstein had confided in a letter to Besso that

 

Trouble is brewing between Maja and Paul. They ought to divorce as well. Paul is supposedly having an affair and the marriage is quite in pieces. One shouldn’t wait too long (like I did). It only does you in for no reason at all! Talk to them both some time when you see them. No mixed marriages are any good. (Anna says: oh!)

 

Apparently Paul had heard the rumors of marital strife and wrote to assure Einstein that “we have been living very harmoniously, as before”. But after their forced separation in 1939 Maja and Paul never saw each other again. In 1946, after the war, they began making plans to reunite in Geneva, but Maja suffered a stroke, and thereafter remained bedridden until her death in 1951. To Besso in 1954, nearly 50 years after their discussion in the patent office, Einstein wrote:

 

I consider it quite possible that physics cannot be based on the field principle, i.e., on continuous structures.  In that case, nothing remains of my entire castle in the air, gravitation theory included...

 

In March of the following year, Michele Besso died at his home in Geneva. Einstein wrote to the Besso family "Now he has gone a little ahead of me in departing from this curious world". Einstein himself died about a month later, on April 18, 1955.

3.9  Constructing the Principles

 

In mechanics as reformed in accordance with the world-postulate, the disturbing lack of harmony between Newtonian mechanics and modern electrodynamics disappears of its own accord.

                                                                                                                H. Minkowski, 1908

 

The general public took little notice of the special theory of relativity when it first appeared in 1905, but following the sensational reports of the eclipse observations of 1919 Einstein instantly became a world-wide celebrity, and there was suddenly intense public interest in everything having to do with “Einstein’s theory”. The London Times asked him to explain his mysterious theory to its readers. He obliged with a short essay that is notable for its description of what he regarded as two fundamentally different kinds of physical theories. He wrote:

 

We can distinguish various kinds of theories in physics. Most of them are constructive. They attempt to build up a picture of the more complex phenomena out of the materials of a relatively simple formal scheme from which they start out. Thus the kinetic theory of gases seeks to reduce mechanical, thermal, and diffusional processes to movements of molecules -- i.e., to build them up out of the hypothesis of molecular motion. When we say that we have succeeded in understanding a group of natural processes, we invariably mean that a constructive theory has been found which covers the processes in question.

 

Along with this most important class of theories there exists a second, which I will call "principle-theories." These employ the analytic, not the synthetic, method. The elements which form their basis and starting-point are not hypothetically constructed but empirically discovered ones, general characteristics of natural processes, principles that give rise to mathematically formulated criteria which the separate processes or the theoretical representations of them have to satisfy.  Thus the science of thermodynamics seeks by analytical means to deduce necessary conditions, which separate events have to satisfy, from the universally experienced fact that perpetual motion is impossible.

 

The advantages of the constructive theory are completeness, adaptability, and clearness, those of the principle theory are logical perfection and security of the foundations.  The theory of relativity belongs to the latter class.

 

Einstein was not the first to discuss such a distinction between physical theories. In an essay on the history of physics, delivered as an address in 1904 and later included in the book “The Value of Science”, Poincare had described how, following Newton’s success with celestial mechanics, the concept of central forces acting between material particles was used almost exclusively as the basis for constructing physical theories (the exception being Fourier’s theory of heat). Poincare expressed an appreciation for this constructive approach to physics.

 

This conception was not without grandeur; it was seductive, and many among us have not finally renounced it; they know that one will attain the ultimate elements of things only by patiently disentangling the complicated skein that our senses give us; that it is necessary to advance step by step, neglecting no intermediary; that our fathers were wrong in wishing to skip stations; but they believe that when one shall have arrived at these ultimate elements, there again will be found the majestic simplicity of celestial mechanics.

 

Poincare then proceeded to a section called “The Physics of Principles”, where he wrote:

 

Nevertheless, a day arrived when the conception of central forces no longer appeared sufficient… What was done then? The attempt to penetrate into the detail of the structure of the universe, to isolate the pieces of this vast mechanism, to analyse one by one the forces which put them in motion, was abandoned, and we were content to take as guides certain general principles, the express object of which is to spare us this minute study… The principle of the conservation of energy… is certainly the most important, but it is not the only one; there are others from which we can derive the same advantage. These are: Carnot's principle, or the principle of the degradation of energy. Newton's principle, or the principle of the equality of action and reaction. The principle of relativity, according to which the laws of physical phenomena must be the same for a stationary observer as for an observer carried along in a uniform motion of translation… The principle of the conservation of mass… The principle of least action. The application of these five or six general principles to the different physical phenomena is sufficient for our learning of them all that we could reasonably hope to know of them… These principles are results of experiments boldly generalized; but they seem to derive from their very generality a high degree of certainty. In fact, the more general they are, the more frequent are the opportunities to check them, and the verifications multiplying, taking the most varied, the most unexpected forms, end by no longer leaving place for doubt… Thus they came to be regarded as experimental truths; the conception of central forces  became then a useless support, or rather an embarrassment, since it made the principles partake of its hypothetical character.

 

Einstein is known to have been an avid reader of Poincare’s writings, so it seems likely that he adopted the theoretical classification scheme from this essay.

 

Returning to the previous excerpt from Einstein’s article, notice that he actually mentions three sets of alternative characteristics, all treated as representing essentially the same dichotomy. We're told that constructive theories proceed synthetically on the basis of hypothetical premises, whereas principle theories proceed analytically on the basis of empirical premises. Einstein cites statistical thermodynamics as an example of a constructive theory, and classical thermodynamics as an example of a principle theory.  His view of these two different approaches to thermodynamics was undoubtedly influenced by the debate concerning the reality of atoms, which Mach disdainfully called the "atomistic doctrine". The idea that matter is composed of finite irreducible entities was regarded as purely hypothetical, and the justification for this hypothesis was not entirely clear.  In fact, Einstein himself spent a great deal of time and effort trying to establish the reality of atoms, e.g., this was his expressed motivation for his paper on Brownian motion. Within this context, it's not surprising that he classified the premises of statistical thermodynamics as purely hypothetical, and the development of the theory as synthetic. 

 

However, in another sense, it could be argued that the idea of atoms actually arises empirically, and represents an extreme analytic approach to observed phenomena. Literally the analytic method is to "take apart" the subject into smaller and smaller sub-components, until arriving at the elementary constituents. We regard macroscopic objects not as indivisible wholes, but as composed of sub-parts, each of which is composed of still smaller parts, and we continue this process of analysis at least until we can no longer directly resolve the sub-parts (empirically) into smaller entities. At this point we may resort to some indirect methods of inference in order to carry on the process of empirical analysis. Indeed, Einstein's work on Brownian motion did exactly this, in so far as he was attempting to analyze the smallest directly observable entities, and to infer, based on empirical observations, an even finer level of structure. It was apparently Einstein's view that, at this stage, a reversal of methodology is required, because direct observation no longer provides unique answers: the inferences are necessarily indirect, i.e., they can only be made by freely hypothesizing an underlying structure, synthetically working out the observable implications of that hypothesis, and comparing these with what we actually observe.

 

So Einstein's conception of a constructive (hypothetically based, synthetic) physical theory was of a theory arrived at by hypothesizing or postulating some underlying structure (consistent with all observations, of course), and then working out the logical consequences of those postulates to see how well they account for the whole range of observable phenomena. At this point we might expect Einstein to classify special relativity as a constructive theory, because it's well known that the whole theory of special relativity - with all its observable consequences - can be constructed synthetically based on the exceedingly elementary hypothesis that the underlying structure of space and time is Minkowskian. However, Einstein's whole point in drawing the distinction between constructive and principle theories was to argue that relativity is not a constructive theory, but is instead a theory of principle.

 

It's clear that Einstein's original conception of special relativity was based on the model of classical thermodynamics, even to the extent that he proposed exactly two principles on which to base the theory, consciously imitating the first and second laws of thermodynamics. Some indication of the ambiguity in the classification scheme can be seen in the various terms that Einstein applied to these two propositions. He variously referred to them as postulates, principles, stipulations, assumptions, hypotheses, definitions, etc. Now, recalling that a "constructive theory" is based on hypotheses, whereas a "principle theory" is based on principles, we can see that the distinction between principles and postulates (hypotheses) is significant for correctly classifying a theory, and yet Einstein was not very careful (at least originally) to clarify the actual role of his two foundational propositions.

 

Nevertheless, he consistently viewed special relativity as a theory of principle, with the invariance of light speed playing a role analogous to the conservation of energy in classical thermodynamics, both regarded as high-level empirical propositions rather than low-level elementary hypotheses.  Indeed, it's possible to make this more than just an analogy, because in place of the invariance of light speed (with respect to all inertial coordinate systems) we could just as well posit conservation of total mass-energy (with the conversion E = mc2), and use this conservation, together with the original principle of relativity (essentially carried over from Newtonian physics), as the basis for special relativity.  As late as 1949, in his autobiographical notes (which he jokingly called his "obituary"), Einstein wrote

 

Gradually I despaired of the possibility of discovering the true laws by means of constructive efforts based on known facts.  The longer and more desperately I tried, the more I came to the conviction that only the discovery of a universal formal principle could lead us to assured results.  The example I saw before me was thermodynamics.  The general principle was there given in the theorem: the laws of nature are such that it is impossible to construct a perpetuum mobile (of the first or second kind)... The universal principle of the special theory of relativity is contained in the postulate: The laws of physics are invariant with respect to Lorentz transformations (for the transition from one inertial system to any other arbitrarily chosen inertial system).  This is a restricting principle for natural laws, comparable to the restricting principle of the nonexistence of the perpetuum mobile that underlies thermodynamics.

 

Here Einstein refers to "constructive efforts based on known facts", whereas in the 1919 article he indicated that constructive theories are based on "a relatively simple formal scheme" such as the hypothesis of molecular motion (i.e., the atomistic doctrine that Mach (for one) rejected as unempirical), and principle theories are based on empirical facts.  In other words, the distinguishing characteristics that Einstein attributed to the two kinds of theories have been reversed. This illustrates one of the problematic aspects of Einstein's classification scheme: every theory is ultimately based on some unprovable premises, and at the same time every (nominally viable) theory is based on what might be called known facts, i.e., it is connected to empirical results. Einstein was certainly well aware of this, as shown by the following comment (1949) in a defense of his methodological approach:

 

A basic conceptual distinction, which is a necessary prerequisite of scientific and pre-scientific thinking, is the distinction between "sense-impressions" (and the recollection of such) on the one hand and mere ideas on the other. There is no such thing as a conceptual definition of this distinction (aside from circular definitions, i.e., of such as make a hidden use of the object to be defined). Nor can it be maintained that at the base of this distinction there is a type of evidence, such as underlies, for example, the distinction between red and blue. Yet, one needs this distinction in order to be able to overcome solipsism.

 

In view of this, what ultimately is the distinction between what Einstein called constructive theories and principle theories?  It seems that the distinction can only be based on the conceptual level of the hypotheses, so that constructive theories are based on "low level" hypotheses, and principle theories on "high level" hypotheses.  In this respect the original examples (classical thermodynamics and statistical thermodynamics) cited by Einstein are probably the clearest, because they represent two distinct approaches to essentially the same subject matter. In a sense, they can be regarded as just two different interpretations of a single theory (much as special relativity and Lorentz's ether theory can be seen as two different interpretations of the same theory). Now, statistical thermodynamics was founded on hypotheses - such as the existence of atoms - that may be considered "low level", whereas the hypothesis of energy conservation in classical thermodynamics can plausibly be described as "high level". On the other hand, the premises of statistical thermodynamics include the idea that the molecules obey certain postulated equations of motion (e.g., Newton's laws) which are essentially just expressions of conservation principles, so the "constructive" approach differs from the "theory of principle" only in so far as its principles are applied to very low-level entities. The conservation principles are explicitly assumed only for elementary molecules in statistical thermodynamics, and then they are inferred for high-level aggregates like a volume of gas.  In contrast, the principle theory simply observes the conservation of energy at the level of gases, and adopts it as a postulate, as the schematic below illustrates.
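
As a schematic illustration of this contrast (a sketch in modern notation, not drawn from the texts under discussion), consider an isolated ideal monatomic gas. In the kinetic picture its internal energy is simply the sum of the molecular kinetic energies,

E = Σ_i (1/2) m_i v_i^2

and since the postulated molecular equations of motion conserve this sum, conservation of energy for the macroscopic gas is a derived result. Classical thermodynamics, by contrast, takes the energy balance of the bulk gas, dE = δQ - δW, directly as an empirically grounded postulate.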

 

In the case of special relativity, it's clear that Einstein originally developed the theory from a "high-level" standpoint, based on the observation that light propagates at the same speed with respect to every system of inertial coordinates. He himself felt that a constructive model or interpretation for this fact was lacking. In January of 1908 he wrote to Sommerfeld

 

A physical theory can be satisfactory only if its structures are composed of elementary foundations.  The theory of relativity is ultimately just as unsatisfactory as, for example, classical thermodynamics was before Boltzmann interpreted entropy as probability.

 

However, just eight months later, Minkowski delivered his famous lecture at Cologne, in which he showed how the theory of special relativity follows naturally from just a simple fundamental hypothesis about the metric of space and time. There can hardly be a lower conceptual level than this, i.e., some assumption about the metric(s) of space and time is seemingly a prerequisite for any description - scientific or otherwise - of the phenomena of our experience. Kant even went further, and suggested that one particular metrical structure (Euclidean) was a sine qua non of rational thought. We no longer subscribe to such a restrictive view, and it may even be possible to imagine physical ideas prior to any spatio-temporal conceptions, but nevertheless the fact remains that such conceptions are among the most primitive that we possess. For example, the posited structure of space and time is more primitive than the notion of atoms moving in a void, because we cannot even conceive of "moving in a void" without some idea of the structure of space and time. Hence, if a complete physical theory can be based entirely on nothing other than the hypothesis of one simple form for the metric of space and time, such a theory must surely qualify as "constructive". Minkowski’s spacetime interpretation did for special relativity what Boltzmann’s statistical interpretation did for thermodynamics, namely, it provided an elementary constructive foundation for the theory.
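
To make the content of that single hypothesis explicit (a gloss in modern notation, not Minkowski's own presentation), the postulated metrical structure amounts to the invariance of the interval

(ds)^2 = (c dt)^2 - (dx)^2 - (dy)^2 - (dz)^2

between neighboring events with respect to every inertial coordinate system. The familiar consequences of special relativity - time dilation, length contraction, the relativity of simultaneity, the velocity composition law - then follow synthetically from this one quadratic form, much as the theorems of Euclidean geometry follow from the form (dx)^2 + (dy)^2 + (dz)^2.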

 

Einstein's reaction to Minkowski's work was interesting.  It's well known that Einstein was not immediately very appreciative of his former instructor's contribution, describing it as "superfluous learnedness", and joking that "since the mathematicians have attacked the relativity theory, I myself no longer understand it any more".  He seems to have been at least partly serious when he later said "The people in Gottingen [where both Minkowski and Hilbert resided] sometimes strike me not as if they wanted to help one formulate something clearly, but as if they wanted only to show us physicists how much brighter they are than we".  Of course, Einstein's appreciation subsequently increased when he found it necessary to use Minkowski's conceptual framework in order to develop general relativity.  Still, even in his autobiographical notes, Einstein seemed to downplay the profound transformation of special relativity that Minkowski's insight represents.

 

Minkowski's important contribution to the theory lies in the following: Before Minkowski's investigation it was necessary to carry out a Lorentz transformation on a law in order to test its invariance under Lorentz transformations; but he succeeded in introducing a formalism such that the mathematical form of the law itself guarantees its invariance under Lorentz transformations.

 

In other words, Einstein characterized Minkowski's contribution as merely the introduction of a convenient mathematical formalism. He then added, almost as an afterthought,

 

He [Minkowski] also showed that the Lorentz transformation (apart from a different algebraic sign due to the special character of time) is nothing but a rotation of the coordinate system in the four-dimensional space.

 

This is a rather slight comment when we consider that, from the standpoint of Einstein's own criteria, Minkowski's insight that Lorentz invariance is purely an expression of the (pseudo) metric of a combined four-dimensional space-time manifold at one stroke renders special relativity a constructive theory, the very thing for which Einstein had sought so "desperately" for so long. As he wrote in the London Times article quoted above, "when we say that we have succeeded in understanding a group of natural processes, we invariably mean that a constructive theory has been found which covers the processes in question", but he himself had given up on the search for such a theory in 1905, and had concluded that, for the time being, the only possibility of progress was by means of a theory of principle, analogous to classical thermodynamics. Actual understanding of the phenomena would have to wait for a constructive theory. As it happened, this constructive theory was provided just three years later by his former mathematics instructor in Gottingen.
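
For what it's worth, here is a sketch (again in modern notation, not a quotation from Minkowski) of what the "rotation" remark amounts to. For relative motion with speed v along the x axis, the Lorentz transformation can be written as a hyperbolic rotation

x' = x cosh(q) - ct sinh(q),     ct' = ct cosh(q) - x sinh(q),     tanh(q) = v/c

where q is the so-called rapidity. Since cosh(q)^2 - sinh(q)^2 = 1, this transformation leaves the quantity (ct)^2 - x^2 unchanged, exactly as an ordinary rotation through an angle leaves x^2 + y^2 unchanged; the "different algebraic sign" Einstein mentions is the minus sign in this invariant.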

 

From this point of view, it seems fair to say that the modern theory of special relativity has had three distinct forms. First was Lorentz's (and Poincare's) ether theory (1892-1904) which, although conceived as a constructive theory, actually derived its essential content from a set of high-level principles and assumptions as discussed in Section 3.6.  Second was Einstein's explicit theory of principle (1905), in which he identified and isolated the crucial premises underlying Lorentz’s theory, and showed how they could be consistently interpreted as primitive aspects of space and time. Third was Minkowski's explicitly constructive spacetime theory (1908). Each stage represented a significant advance in clarity, with Einstein's intermediate theory of principle and its interpretation serving as the crucial bridge between the two very different constructive frameworks of Lorentz and Minkowski.

 
