1. The Problem - Microsoft Azure
嚜澤 Purely Relational Way of Computing Neighbors on a Sphere
Jim Gray
July 2002
Abstract: Relational operators and a zoned index accelerates spatial lookups by about
8-fold and spatial joins by about 32-fold when compared to a technique based on an
external stored procedure.
1. The Problem
o.xyz?xyz
SkyServer applications and queries often need to find all objects nearby a given object in
the celestial sphere. This is such a common operation that we implemented a series of
table-valued functions fGetNearbyObjEq and fGetNearbyObjXyz that return all
objects within a certain radius of a given equatorial point (right ascension and
declination) or a given x,y,z unit vector in the J2000 coordinate system. These functions
use an HTM index to limit the search to about 15 bounding HTM triangles. Then they
filter the objects using the following equation to compute the actual distance (between
object o with celestial coordinates o.cx, o.cy, o.cz and the point x,y,z:
DistanceInDegrees = degrees(2℅asin(sqrt((o.cx-x)2+(o.cy-y)2+ (cz-z)2))/2)). (1)
In fact the SkyServer has a dozen
neighbors functions that return nearby or
sin(牟/2) = |o.xyz?xyz|/2
xyz
nearest objects and some of their attributes
牟/2
牟
每 but all these functions are based on the
o.xyz
core HTM-then-filter algorithm 每 indeed
they all call one core procedure that has
this logic.
Figure 1: construction for equation (1).
xyz and o.xyz are unit vectors.
Some queries want to compare each of
several hundred million objects with all their neighbors. Searchers for gravitational
lenses and for clusters are examples of such queries. To speed these queries the
SkyServer precomputes the Neighbors Table that for each object lists all its neighbors
within 30 arcseconds along with summary attributes (type and isPrimary). This table
averages about 9 neighbors per object; but, some objects have hundreds of neighbors and
some have none.
Computing the neighbors table using the fGetNearbyObjectsXyz() function can take a
long time: on the fifteen million object SDSS early data release, the computation took 56
hours 每 or about 74 objects per second. Fortunately, the computation was done only a
few times in the load process and then used many times in queries. But, it was obvious
that some speedup will be needed as the SDSS database grows twenty-fold over the next
3 years. Indeed, with the SDSS Data Release 1, the database is about to grow six-fold so
the naive computation would take about 2 weeks1. The full dataset would take about 2
months to compute.
1
This performance is measured on a dual 800 MHz Xeon processor with 2GB ram and 12 disks with
150MB/s bandwidth. All later measurements are performed on a 722MHz Pentium III with 0.5 GB of ram
and 1 disk with 10 MB/s bandwidth.
1
The computation is embarrassingly parallel and cpu-bound. Each object*s neighbors can
be computed independently and the computation is cpu-bound. So, the computation
could be accelerated by using multiple processors. A 7-node processor farm could do
the DR1 job in 2 days. But, it makes sense to analyze the problem and look for better
algorithms.
The basic problem is that SQL can evaluate equation (1) at the rate of about 170,000
records per second per cpu (5.6 ?s per row) while the HTM functions run at about 170
records per second per cpu (5.9 ms per row to return the nearest object.) This is a 1,000:1
performance difference. The high costs of the HTM functions is a combination of the
HTM procedures, the expensive linkage to SQL via external stored procedures (a string
interface), and the use of table-valued functions. It appears, based preliminary timing
tests (a thousand calls to itest5() of the sqlExample HTM procedures) that the HTM
code uses about 3 ms and that the other costs (linkage, string conversion, and tablevalued function are in the range of 2ms). It will be fruitful to investigate why the HTM
code consumes so much cpu time 每 but that code is opaque to me. The linkage costs will
be much reduced with the SQLserver integration with C# in 2003.
2. A Zoned-join Accelerates the Near Functions
An alternative is to use SQL operations to limit the search rather than using the HTM
procedures. By doing this we achieve about a three-fold speedup for the table-valued
functions. This same idea when applied to computing the neighbors functions gives
about a 32-fold speedup, tuning a 14-day neighbors-table computation into a 9-hour job.
Figure 2: The division of the sphere into 12 zones (in practice there
are thousands of zones). Two circular neighborhoods are shown, one
inside one zone (minZone=maxZone) and another crossing 3 zones
(minZone+2= maxZone.) The dotted boxes show how the ra filter
and the dec filter further reduce the search. The ra filter needs to be
※expanded§ by 1/cos(dec).
The basic idea is to map the celestial sphere into zones, each zone is a declination stripe
of the sphere with some zoneHeight (see Figure 2). Number the south-pole zone as zone
zero. Then an object with a declination of dec degrees will be in zone:
zoneNumber = floor((dec+90) /zoneHeight)
(2)
There will be ceiling(180/zoneHeight) zones. The following code builds the zone table.
create table zone (zone int, objID bigint, ra float, dec float,
(3)
x float, y float, z float,
primary key (zone, ra, objID))
insert into zone
select floor((dec+90) /zoneHeight), ra, dec, cx, cy, cz
from PhotoObj
If we search for all objects within a certain radius of point (ra, dec) then we need only
look in certain zones, and only in certain parts of each zone. Indeed, to find all objects
2
within radius R of point RA, DEC2, one only consider zones between
maxZone = ceiling ( (DEC+90+R)/zoneHeight)
(4)
minZone =floor( (DEC+90-R)/zoneHeight)
and within these zones one only need consider objects o with
o.ra between RA-R and RA+R
(5a)
(modulo some cos() and mod(360∼) corrections added below (see (5)).
This way of limiting the search is similar to the HTM approach but avoids calling an
external procedure. The primary key on zones makes this lookup very fast, so that the
resulting procedure has the performance given in Figure 3. This incidentally shows that
table-valued functions can be quite fast.
1000
2.00
Rows vs elapsed time
fit is 1.46+2.2e-4*r^2 ms/asec
Relative time vs zone height (asec)
4 minute zone is near optimal
2 & 8 minute are slower
7 asec
15 asec
30 asec
1 amin
2 amin
4 amin
16 amin
1 degree
1.90
1.80
7.5 asec
15 asec
30 asec
1 amin
2 amin
4 amin
64 amin
r^2 fit
10
1.70
time vs best
time (ms)
100
1.60
1.50
1.40
1.30
1.20
1.10
1.00
1
10
100
r (asec)
1000
10
100
r (asec)
1000
Figure 3: At left is time vs. radius for neighbors function for various zone heights. It
suggests that any small zone height is adequate. The graph at right plots time vs. best
time for each radius. It indicates that a zoneHeight of 4 arcminutes is near-optimal and
that it outperforms smaller and larger radii by large factors.
There are some nasty details that need a bit of extra mechanism. The biggest problem is
that the sphere is round, so equations (4) and (5) must be computed modulo 360∼, and
equation 5a must be corrected for the fact that the right-ascension is ※compressed§ by
cos(dec) as it moves away from the equator. So ra should be divided by cos(dec)+汍
where 汍 is a tiny number added to prevent division by zero when dec is ㊣90∼.
Fortunately, equation 4 needs no correction, but equation (5a) should become
o.ra between (ra-R)/ (cos(dec)+汍) and (ra+R)/(cos(dec)+汍)
(5)
One other detail is that when one is near the prime meridian (ra = 0∼ or ra = 360∼) then
the ※other end§ of the range is nearby. This problem is solved adding these neighbors
into the zone as the margins with ra less than zero for the left margin and ra greater than
360∼ for the right margin. The main area is the ? open interval [0∼,360∼) and the margins
must respect this ? open property. Assuming a MaxRadius of 1∼ and epsilon is 1e-6:
2
We assume RA and DEC have been normalized to be in the ranges [0∼, 360∼] and [-90∼,90∼] respectively.
3
Insert into zone
-- right margin, notice +360 on ra
select floor((dec+90) /@zoneHeight), ra+360, dec, x, y, z
from PhotoObj
where ra >= 0
and ra < @MaxRadius/(cos(radians(dec))+@epsilon)
Insert into zone
-- left margin, notice -360 on ra
select floor((dec+90) /@zoneHeight), ra-360, dec, x, y, z
from PhotoObj
where ra >= 360每@MaxRadius/(cos(radians(dec))+@epsilon)
and ra < 360
Now, equation (5) is actually correct and finds all neighbors within the zone. The full
query to select the neighbors within @r of @ra and @dec from a zone is:
select o1.objID
from zone o1
where o1.zoneID = @zoneID
-- force the zone
-- using zone number and ra interval
-- scaled by cos(abs(dec)) (positive cos)
and o1.ra between @ra - @r/(cos(radians(abs(@dec)))+ @notZero)
and @ra + @r/(cos(radians(abs(@dec)))+ @notZero)
and o1.dec between @dec-@r -- quick filter on dec
and @dec+@r
and (2*degrees(asin(sqrt(
power(o1.cx-@cx,2)+power(o1.cy-@cy,2)+power(o1.cz-@cz,2) )/2)))
< @r -- careful filter on distance
This statement combined with the minZone and maxZone logic of equation (4) gives the
performance described in Figure 3 for a table-valued function finding neighbors nearby a
point. This is the 7x speedup over the HTM external procedures. The full statement
handling this zone logic is:
select o1.objID
from zone o1
where o1.zoneID between
-- force the zone
-- using zone number and ra interval
floor((@dec+90-@r)/@zoneHeight) and
floor((@dec+90+@r)/@zoneHeight)
and o1.ra between @ra - @r/(cos(radians(abs(@dec)))+ @notZero)
and @ra + @r/(cos(radians(abs(@dec)))+ @notZero)
and o1.dec between @dec-@r -- quick filter on dec
and @dec+@r
and (2*degrees(asin(sqrt(
power(o1.cx-@cx,2)+power(o1.cy-@cy,2)+power(o1.cz-@cz,2) )/2)))
< @r -- careful filter on distance
Expressed in this format rather than a nested loop
exactly specifying each zone and ra range, the
optimizer is unlikely to pick the right plan.
One can further accelerate the test by observing that
@ra-@r and @ra+@r is too ※fat§ a band for any zone
except the one holding the center of the circle (see
Figure 4 .
These changes turn the 2-week build-the-neighbors
job into a 2-day job. But, there is one more speedup.
r
ra-zoneMax
x
zoneMax
Ra ㊣ x
Figure 4: when a circle crosses multiple
zones, the ra range of the adjacent zones can
be reduced from r to x as shown in the
diagram. For higher zones, zoneMin replaces
zoneMax.
4
3. A Zoned-Nested-Loops-Join Accelerates Building Neighbors
When one wants to batch-compute many neighbors, the zone approach can bypass the
stored procedure and get another 10-fold speedup (or more if parallelism is used) as
follows. We can join each zone with itself and then with its north and south neighbor
zones. These three joins all use the relational operators with automatic parallelism and
with some very sophisticated optimizations. This bypasses much of the transact-SQL
logic in the original algorithm.
The basic join to compute the neighbors is:
insert neighbors
-- insert one zone's neighbors
select o1.objID as objID,
o2.objID as NeighborObjID,
.. other fields elided
from zone o1 inner loop join zone o2
-- force a nested loop
on o1.zoneID-@deltaZone = o2.zoneID -- using zone number and ra
and o2.ra between o1.ra - @r/(cos(radians(abs(o1.dec)))+ @notZero)
and o1.ra + @r/(cos(radians(abs(o1.dec)))+ @notZero)
where (
o1.ra >= 0 and o1.ra < 360
-- objects not both marginal
or o2.ra >= 0 and o2.ra < 360 )
and o1.objID < o2.objID
-- do 1/2 the work now
and o2.dec between o1.dec-@r and o1.dec+@r -- quick filter on dec
and (2*DEGREES(ASIN(sqrt(
power(o1.cx-o2.cx,2)+power(o1.cy-o2.cy,2)+power(o1.cz-o2.cz,2)
)/2))) < @r
-- careful filter on distance
this is done for @deltaZone in {-1, 0, 1}. The insert-join above does only ? the work,
finding only objects where o1.objID ................
................
In order to avoid copyright disputes, this page is only a partial summary.
To fulfill the demand for quickly locating and searching documents.
It is intelligent file search solution for home and business.
Related download
- vendor microsoft exam code 70 464 exam name developing
- transact sql user defined functions novick software
- sql functions junagadh
- microsoft sql training
- user defined financial functions for ms sql server
- example codes to create clr udt and clr table valued
- microsoft sql server performance tuningfor microsoft
- database modeling and design
- 1 the problem microsoft azure
- cs143 sql query 3
Related searches
- significance of the problem example
- statement of the problem in research
- the problem with philosophy bertrand russell
- statement of the problem template
- the problem of philosophy pdf
- microsoft azure revenue
- statement of the problem examples
- the problem with philosophy
- the problem with high school
- the problem with jews
- what is the problem with command economies
- statement of the problem example