Browsing Documents on a Dense Embedding that Captures ...
嚜濁rowsing Documents on a Dense Embedding that Captures
Theme Evolution
Alessandro Perina
Microsoft Research
One Microsoft Way, Redmond, WA
alperina@
ABSTRACT
We describe a new interaction strategy for browsing documents consisting of text and images. The browser represents
a collection of documents as a grid of key words with varying font sizes that indicate the words* weights. The grid is
computed using the counting grid model [7], so that each
document approximately matches in its word usage the word
weight distribution in some window (6 ℅ 6 in our experiments) in the grid. In comparison to other document embedding approaches, this strategy leads to denser packing of documents and higher relatedness of nearby documents: The two
documents that map to overlapping windows literally share
the words found in the overlap. This leads to smooth thematic
shifts that can provide connections among distant topics on
the grid. The images are embedded into the appropriate locations in the grid, so that a mouse over any location can invoke
a pop-up of the images mapped nearby. Once the user locks
on an interesting spot in the grid, the summaries of the actual
documents that mapped in the vicinity are listed for selection.
In this document browser the arrangement of related words
and themes on the grid naturally guides the user*s attention to
topics of interest. For an illustration we describe and demonstrate (in video submission) a browser of four months of CNN
news.
INTRODUCTION
Summarizing, visualizing and browsing text corpora are
important problems in computer-human interaction. As the
data becomes more massive, ambiguous, or conflicting, it
may become hard for people to glean insights from it. To
help the users, researchers have developed several visual
analytics tools facilitating the analysis of such corpora.
These tools are used to interactively make sense of complex
datasets, a process referred to as sensemaking [10].
We describe a new approach to browsing documents consisting of text and images, e.g. news stories on the web, social
media, special interest web sites, etc. The browsing through
documents is based on the exploration of the hidden variable
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, or
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
CHI*13, April 27每May 2, 2013, Paris, France.
Copyright 2013 ACM 978-1-XXXX-XXXX-X/XX/XX...$10.00.
Nebojsa Jojic
Microsoft Research
One Microsoft Way, Redmond, WA
jojic@
of the on the counting grid (CG) generative model [7], which
has recently been used for a variety of tasks related to regression and classification. The counting grid model represents
the space of possible documents as a grid of word counts.
Each individual document is mapped to a window into this
grid so that the tally of these counts approximately matches
the word counts in the document. The grid can vary in size,
and so can the window. As the documents are allowed to be
mapped with overlap, in order to maximize the likelihood of
the data, the learning algorithm has to map similar documents
to nearby locations in the grid, so that the words that the two
documents share appear in the grid positions in the overlap of
the corresponding windows. This leads to a compact representation where the theme of the documents smoothly varies
across the grid, achieving a higher density of packing than
previous embedding approaches (e.g. Egypt unrest news are
placed close to other stories about Arab Spring, with Libya
taking another distinct location in that area of the CG; nearby
are stories about oil prices, and near these are more stories
about the markets and economy, near which are stories referring to Fed*s Bernanke, near which are stories about congress
and the President, which, in a counting grid defined on a torus
may loop back to Libya through military themes.) To provide
natural means of summarization and browsing of the documents, we render a CG representation based only on the most
frequent words in each position. We further embed the images from each document into the appropriate locations in the
counting grid, so that they can pop up when the user focuses
on a particular area of the grid (e.g. by mouse over). This provides the user with both a global and local perspective on the
underlying set of documents and their relationships, without
observing directly the underlying documents, but rather the
CG model*s representation of the document space. Once the
user locks on an interesting spot in the grid, the summaries of
the actual documents that mapped in the vicinity are listed for
selection. This idea leads to an intuitive document browser
that is especially well suited to touch devices, where moving
a cursor is the most natural interaction modality, while typing
is particularly difficult. Additionally, the interface assists the
user in discovering documents of interest without having to
define a particular target and associated keywords first: The
arrangement of related words and themes on the grid naturally guides the user*s attention to topics of interest. For an
illustration we describe and demonstrate (video submission) a
browser of four months of CNN news from winter and spring
2011, a period particularly rich in news-worthy events.
COUNTING GRIDS (CGS)
羽z
...
k-1
k
k+1
k+2
...
...
i-1
music
afghan
alcohol
religi
spirit
religi
mean
influenc
spirit
blame
homeland
reason
oil
crazi
recent
rid
realiz
side
lose
hear
failur
conflict
gunfir
unanim
eastern
diplomat
gadhafi
libyan
pope
mean
radic
speech
radic
youth
syria
undermin
islam
repeat
revolut
cabl
cabl
ra
local
fight
sanction
altern
kimoon
secretarygener
resolut
justic
digniti
necessari
aircraft
necessari
offer
unit
cuba
court
guantanamo
trial
attack
oldest
brought
rock
threw
holm
corrupt
ralli
toppl
zine
abidin
tunisia
tunisia
upris
tunisian
flee
capit
upris
african
ambassador
asset
closer
adopt
refer
ban
loss
ban
i
council
council
libya
i+1
attack
particip
islamist
scene
angri
inspir
protest
peac
protest
mubarak
hosni
peac
protest
gather
demonstr
chant
prodemocraci
tuni
ali
capit
regim
clash
libya
tripoli
crackdown
demonstr
protest
protest
africa
opposit
daraa
opposit
libya
ben
libya
civilian
nato
militari
civilian
militari
minist
group
egypt
sanaa
ouster
ouattara
ivori
resolut
rebel
gbagbo
muslim
robertson
gadhafi
moammar
libyan
libya
moammar
nato
gbagbo
rebel
libyan
...
unrest
gadhafi
nato
libyan
libya
libyan
gadhafi
misrata
coalit
nofli
fighter
bomb
nofli
strike
19 Jan 2011 : Protesters march in Tunis amid
Arab League fears
27 Jan 2011 : Social media's role in North
Africa's unrest
11 Apr 2011 : None of us was Gadhafi's
lover, one of his nurses says
28 Feb 2011 : N. Africa, Mideast protests:
Diplomat says Libya long thought Gadhafi
crazy
21 Feb 2011 : N. Africa, Mideast protests Gadhafi: I'm still here
23 Feb 2011 : N. Africa, Mideast protests:
Libya's interior minister kidnapped, media
say
27 Feb 2011 : N. Africa, Mideast protests:
Egyptian stock market set to reopen Tuesday
Figure 1. A part of the counting grid trained on the news stories. Three windows are highlighted along with seven stories that mapped there. Color
indicates the mapping. The movement through the grid captures the spread of the Arab Spring in North Africa, and the subsequent UN reaction.
The counting grid consists of a set of discrete locations indexed by ` in a map of arbitrary dimensions (30℅30 to 40℅40
2D torus grids in examples here). A part of a counting grid
is illustrated in Fig 1. Each location contains a different set
of weights for the Z words in the vocabulary (Z = 10000
here). The weight of the z-th word atPlocation ` is denoted by
羽z,` and the weights add up to one, ` 羽z,` = 1. Thus 羽 is a
probability distribution over words and defines the local word
usage proportions. (These weights are partially illustrated in
Fig. 1 using font size variation, but showing only the top 3
words at each location.) A document has its own word usage
counts cz and the assumption of the counting grid model is
that this word usage pattern is well represented at some location k in the grid in the following way: When a window of
a certain size is placed at location k in the CG, and the CG
weights are averaged across
P N CG locations in the window
Wk to obtain hz = N1 `﹋Wk 羽z,` , then this distribution is
approximately proportional to the observed document counts
hz ≦ cz . In other words, approximately the same words in the
same proportions are used in the document and in its corresponding counting grid window Wk . The window size 6 ℅ 6,
and thus N = 36 was used in our experiments, but due to
space limitations 3 ℅ 3 windows were used in Fig. 1.
The CG estimation algorithm [7] starts with a random initialization which gives all words roughly equal weights everywhere. The subsequent iterations (re)map the documents
to the windows in the grid and rearrange words to match the
weights currently seen in the grid. In each iteration, after the
mapping, the grid weights at each location are re-estimated to
match the counts of the mapped document words. We found
that the algorithm converged in 70-80 iterations, which sums
up to minutes for summarizing months of news on a single
standard PC. As this EM algorithm is prone to local minima,
the final grid will depend on the random initialization, and
the neighborhood relationships for mapped documents may
change from one run of the EM to the next. However, as
shown in the supp. material, the grids qualitatively always
appeared very similar, and some of the more salient similarity relationships were captured by all the runs (e.g. the
Arab Spring news that referred to multiple different countries
with very different unfolding of events are always grouped
nearby). More importantly, a majority of the neighborhood
relationships make sense from a human perspective and thus
the mapping gels the documents together into logical, slowly
evolving themes. As discussed below, this helps guide our visual attention to the subject of interest. As the algorithm optimizes the likelihood of the data, all resources (grid locations)
must be used, and the packing is much denser than in the
previous embedding approaches, thus occasionally squishing
themes together even though no documents map to their interface. In our opinion, it is a small price to pay for high real
estate utilization and, for the most part, intuitive arrangement
of themes
MULTIMODAL CG DISPLAY AND BROWSING
To browse a collection of multimodal documents consisting
of both text and images, we first fit a CG model to the corpus, and then embed the images into appropriate locations of
the grid, so that each image is placed in the grid position in
the center of the window to which the source document was
mapped (Fig. 2). This results in a grid of images of the same
size as the word counting grid with a rough semantic alignment: In each image*s vicinity the grid locations have high
weights on the words related to the image. Obviously, there
is now a multitude of possible approaches to visualizing this
embedding in a way that explores the two modalities in concert. To show the image embedding, we can simply show a
tiling of images (e.g. based on the 30 ℅ 30 CG). In locations where multiple images are mapped, we can pick one at
random (as in our experiments), or the one that was used in
multiple documents, or the one selected by a computer vision
algorithm. In addition, the images mapped to the same location can slowly cycle. To visualize the CG word weights 羽z,`
in each grid location, we show the top k words (k = 3 in our
experiments) using the font size to indicate the word weight.
In our browser, we can switch between the two representations, or show them one on top of the other with a certain
level of transparency (Fig. 2). In addition, a pointer (a mouse
cursor, fingertip on touch devices, etc.) can be used to force
the switch between images and words locally in a window of
a certain size (5 ℅ 5 in our experiments). In this way the user
can base their exploration primarily on one modality, bringing the other modality to the fore by hovering over the grid
parts of interest. In particular, we find the word representation particularly useful in drawing the user*s attention across
related themes to the point of interest. As the user naturally
moves the pointer toward their eyes* focal point the pointer
uncovers images underneath to further refine the user*s understanding of the grid content. At any point, the user can stop
and indicate (e.g. by a click) their desire to see the source
documents that mapped in this region. We implemented two
ways of uncovering the images in the region where the user
hovers. In the first approach, the words in the grid locations
around the cursor are highlighted and the images from these
locations are shown next to the highlighted area. In the second approach, we simply replace the area around the cursor
with images. As the embedding is based on overlapping windows, in both cases it is possible that some of the images that
pop up this way are related to the themes slightly outside the
highlighted area. Once the user is used to this it becomes imperceptible as the matching words (or images) are never far
and slight movements of the pointer help lock onto the topic
of interest. To further indicate the smooth nature of the mapping, we experimented with varying sizes and intensities of
images that pop up. For example, in Fig. 2 the central image
of the highlight is of larger size and it slightly overlaps the
6 images around it, which themselves are larger and overlap
even more the images around them, creating an impression
of the underlying images popping out from the words, with
relationship being approximate but smooth, inviting the user
to move the cursor around.
Although the CG model glues the documents together based
on the vocabulary overlap that can contain a large number of
different words, to a human observer, just the top words for
each location seem to provide enough insight into the thematic shifts in the grid. The grid in Fig. 2 gels the disaster
stories together due to their common vocabulary (e.g. disaster, response, emergency, etc.), but in the browser most of
that shared vocabulary is overtaken by the words that get high
weight in individual locations (earthquake, tornado, airplane,
crash, snow, storm, etc.). The human mind easily detects connections among these and need not observe all of the ※glue§
that linked these topics together. In our experience, the CG
visualization seems to stimulate the user*s own associations
and memory and guides the user to the target even if they did
not start with a particular target in mind: A look at a salient
Japan and earthquake keywords creates an association with
local weather disasters, reminding the user that they were following an airplane crash story. This association process is
guided by CG*s own *associations* so that the spot in the grid
is found quickly. Further interaction with the grid to invoke
visual stimulus increases the pace of news discovery.
To accommodate for variable display sizes and corpora diversities, we can train a hierarchy of CG models of various sizes,
where model of one size is initialized by an upsampled version of the model of the smaller size. In this multi-granular
approach, the user can zoom in and out of any part of the grid.
Window size choice provides the tradeoff between finer document overlaps and the computational complexity of the CG
estimation, but for the CNN news stories at least, the latter
was not a limiting factor.
DISCUSSION
Our approach provides some important advantages over
the existing visualization/browsing/search approaches. The
10x10 grid website 1 also arranges images into a grid. But,
the placement of images is not optimized so that the nearby
locations capture related stories. Previous methods for spatially embedding documents [5, 2] produce sparse representations (e.g. ※The Galaxy of News§ [8] ), which are only
locally browsable, whereas the counting grids use the screen
real estate much more efficiently. In addition, our approach
allows embedding of multiple modalities. Various galaxy
approaches required that the user interact with the embedding through the statistical model, manipulating its parameters and/or weights, which may be impenetrable to the user,
thus requiring a laborious guess and check strategy [1, 6].
This issue is still a subject of research in HCI [3]. In contrast, the CG parameters (grid size and the scope of overlap,
i.e. the window size), are more intuitive, and multi-granular
approaches may remove a need for parameter selection altogether.
The CG visualization reminds one of tag clouds, visual representations that indicate frequency of word usage within textual content. Google News Cloud 2 sorts words alphabetically, varying the font based on the relevance. If a word is
selected other similar words are highlighted. But the links
among the complex documents that combine a variety of
words are not evident. Other tools (e.g., Toronto Sun, Washington Post websites) cluster words based on co-occurrence
or proximity and then position the words belonging to the
same clusters near each other and use color to emphasize the
structure. Still, the words are not spatially embedded within
a cluster, and so only cluster hopping can be performed, in
contrast with smooth thematic drifts found in CGs. For the
most part, the tag clouds are designed to provide a useful and
visually pleasing summary of the news [9, 4], rather than a
1
2
two-dimensional densely organized multimodal browsing index which CG provides. In terms of providing a means for
traversing an organization of news, our method shares some
similarities with Newsmaps3 which use a hierarchical representation, a tree. But the traversal paths descend along the
branches of the tree while CGs often capture many different
directions of thematic drifts which can loop back.
REFERENCES
1. Alsakran, J., Chen, Y., Zhao, Y., Yang, J., and Luo, D. STREAMIT:
Dynamic visualization and interactive exploration of text streams.
131每138.
2. Chen, Y., Wang, L., Dong, M., and Hua, J. Exemplar-based
visualization of large document corpus (infovis2009-1115). IEEE
Transactions on Visualization and Computer Graphics 15 (2009),
1161每1168.
3. Endert, A., Fiaux, P., and North, C. Semantic interaction for visual text
analytics. In ACM CHI (2012), 473每482.
4. Helic, D., Trattner, C., Strohmaier, M., and Andrews, K. Are tag clouds
useful for navigation? a network-theoretic analysis. Journal of Social
Computing and CyberPhysical Systems 1 (2011), 33每55.
5. Iwata, T., Yamada, T., and Ueda, N. Probabilistic latent semantic
visualization: topic model for visualizing documents. In ACM KDD
(2008), 363每371.
6. Jeong, D. H., Ziemkiewicz, C., Fisher, B. D., Ribarsky, W., and Chang,
R. ipca: An interactive system for pca-based visual analytics. Comput.
Graph. Forum 28 (2009), 767每774.
7. Jojic, N., and Perina, A. Multidimensional counting grids: Inferring
word order from disordered bags of words. In UAI (2011), 547每556.
8. Rennison, E. Galaxy of news: An approach to visualizing and
understanding expansive news landscapes. In ACM Symposium on User
Interface Software and Technology (1994).
9. Sinclair, J., and Cardew-Hall, M. The folksonomy tag cloud: when is it
useful? J. Inf. Sci. 34 (2008), 15每29.
10. Thomas, J., and Cook, K. Illuminating the Path: The Research and
Development Agenda for Visual Analytics. IEEE Press, 2005.
3
A
media
spy
corp
push
york
ticket
iran
pack
american
held
steeler
embassi
open
relat
staff
seat
song
barricad
leav
depart
ferri
bullet
land
staterun
charter
malta
aircraft
statement
super
bowl
biggest
weak
close
solid
record
consecut
focu
emploi
tourism
road
tour
eat
wors
observatori
half
fan stadium
crash taiwan
field
australiaflooddodger
plane
airport
vice
construct
hood
negoti
citizen
hezbollah
travel
frustrat
immedi
drill
remov
reportedli
flight
runwai
structur
zealand
seek
michigan
blizzard
ici
mph
total
chicago
ic
snow
oklahoma
rain
metro
fall
omar
pt
mobil
termin
march
food
contact
toyota spokesperson
prime
naoto
subsequ
helicopt
blast
disast
fuel
reactor
weight
devast
affect
record
japan
wave coast
earthquake
boat mile
tsunami
ukushimanuclear
water
explos
evacu
coastal
chile
pose
kilomet
interact
ocean
tower
regul
monitor
accid
fish
leap
studi
asia
lion
scientist
vietnam
columbia
wrote
documentari
side
radio
buckl
habitat
atmospher
particular
veteranforest
learn
hit
wind
heavi
central
atlanta
lotteri
explet
grim
overal
author
washington
visit
check
ticket
detroit
gov
walker
contribut
law
islam
worth
transit
jakarta
controversi
muslim
iraq
foreclosur
rai
imag
coron
judg
testifi
plea
violat
lo
murrai
oprah
post
secret
teacher
perri
tone
critic
connect
listen
radio
terribl
childhood
chandler
span
link
restaur
fisherman
object
border
insid
rio
param
valu
rout
slowli
gen
improv
growth
lohan
look
worri
cia
cent
chines
bed
lindsei
note
increas
price
deadli
far
saudi
case
grace
hln
nanci
heavili
gunshot
snatch
gunmen
hospit
haleigh
rosemari
car
novemb
ireland
san
eventu
pearson
bank
ipad
energi
jump
zoo
quarter
market
index
share
verizon
ad
stock
dow
jone
sell
corpor
averag
nasdaq
industri
injuri
doctor
wit
river
girl
s
cooki
low
compani manag
feed
allow
B
Japan
11-Mar-2011
cl Fuk
ea u
r E sh
m im
er a
ge
nc
y
D
specif
homeland
syria
zine
protest
societi
crazi
repeat
tunisia
ben
unrest
demonstr
capit
protest
bahrain
penguin realiz
cabl
flee
davi
fight
african
medic
awar
act
wild
refer
sanction
crane ban
flown
unknown
held
enforc
shot
campu
raid
cuba
mubara
progovern democraci
libya gadhafi
limit
crisi
ministri
missil
smoke
ukrain
coalit
gbagbo
count
violenc
immigr
perjuri
franc
wound
miami
multipl
alleg
terror
switzerland
airmen
custodi
appeal
misconduct
assang
bin
student
stab
weapon
necessari
nato
militari
bombzone
air
ignit
book
europ
restrict
brown
drug
journal
michael
chicken
sexual
carri
appl
dead
doc
migrant
sex
financ
itali
execut
popular
newspuls
duke
rape
david
page
passport
page
canada
five
cooper
facebook
abl
digit
slain
fear
commit
work
hand
googl
app
suicid
soldier
son
felt
jon
driver
reader
pictur
iraqi
fit
air
traffic
asleep
true
robot
lewi
regist
gaga babi
parent
teen
bodi
try
jersei
search
woman
iowa
mother prostitut
attent
daughter legal
baltimor
barn
handl
card
reach
guarante
agreement
spend
stewart
marijuana
love
knew
good
ring
abortparenthood
shutdown
tax
cut
run
shut
health
fund
billion
debt
job
loomi
want
reader
budget bush
usa
queen
job
sure
ly
soccer
internet jame
good
tonight
wood
career
pitcher
school
break
cleveland
playoff
victori
basebal
brandon
deal
game
cap
butler
sport quarterback player
leagu
ncaa
tournament
nfl
denver
pit
number
championship
hometown
tiger
thought
photo
martin
face
joi
fbi
pennsylvania
ramirez
fellow
ryan
perman
king
cup
england
royal
explain
hack
respect
nick
just
palac
wed
happi
social
facil
embassi
babi
past
wikileak
high
princip
leonard assist
patrick
5
6
7
wind heavi
wind
airport cancel
tulsa
tulsa
system
collaps
airlin temperaturovernight region
blizzard
ici
mph storm tornado tornado south
outag weather forecast weather alabama
file
dalla
chicago blizzard storm weather midwest servic southern
structur zealand seek
futur
cross
send
clear christchurch main
snow snow storm storm line southern
food contact toyota
spokesperson
chicago
blog infrastructur suppli hardest
total
dalla
ic
inch oklahoma southeast servic northern
phone marathon sendai
effect
declar
et
warn missouri plain
morri tennesse brace
et
western
naoto subsequ helicoptcoastal weight chicago
oklahoma
snow
donat impact ireport
et
ic
inch
snow snowstorm tree
brace
kan
expos
impact
fat
shore
feet
snow
ic
inch vermont georgia georgia
veget
test
disast japanes tsunami devast
metro
affect
record
fall northeastgeorgia lightn
cold
highwai blast
tree
japanes japan
inch
fallen
toll
trigger
warn advisori
custom emerg stress
reactor prefectur earthquak tsunami
hit
feet accumul northwest affili
fuel
reactor
japan
wave coast motorist arctic ground atlanta
tokyo tsunami tsunami
earthquak ireport
feet
mile
event atlanta
path
hail
fukushima tokyo
littl
escap
touch
radiat radioact plant damag magnitud harbor extent remot arkansa
sea arkansa highwai
dig
fukushimanuclear japan japan quak boat
mile
bu
daiichi radioact tokyo
stuck
mile
cover
rock
pull
firefight
tokyo magnitud quak
radiat reactor nuclear damag earthquak pacif
ocean
area
sgt
fell
hunt
bear
explos nuclear nuclear
water quak chile volcano globe commiss difficultineighbor
headquart radiat
yukio
pool
trace
struck
floor
hawaii
bird
ev
bear
pull
contain
level
crippl rescuer kilomet kilomet
usg
pattern trauma midnight neighbor firefight
pose
kilomet interact
evacu
secretari level
rise
core
metropolitan
chernobyl chernobyl initi
evacu
small
2
3
4
5
6
7
8
9
10
11
12
13
regul
updat
mississippi
winter
rain
Weather Disasters
japan
water
tri
4
malta
sat
sourc
hood termin airport
crack
cyclon
port
fort tomorrow herald
frustrat airport
expedit flight
fly
milwauke
malta aircraftplane
mobil
path
collabor
juri
hay
bullet
land
land
jack
republ charlott
hole
hole
happen
pilot
tribun
updat
updat
michigan
detroit
explos
maryland
transport
egg
cyru
arial
yemen
syrian
libyan
council
court
johnson
egypt
cairo
reported flight runwai
Snow Emergency MidWest
Hawaii Volcano Feb-2011
Chile
11-Feb-2011
27-Feb-2011
06-Mar-2011
system
peac
canadian
North Carolina
Tornado
Tennessee Tornado
Arkansas
Tornado
Pakistan/Iran
18-Jan-2011
inspir
artist
C
Mississipi Flood
Georgia Thunderstorm
Nu
islamist
corrupt
counti
8
9
10
11
12
13
14
Counting Grids coordinates
Tsunami
New Zeland
21-Feb-2011
rock
speech
america
ferri
sourc
Earthquake
pope
influenc
displai
food
tourist
robertson
spirit
alert
educ
sheriff suspect
brotherhood
squar
attack
terror
hear
shoot
ivan
nurs
conserv
constitu
trip
revolut
great
jim
loughner
outlet
nomine
advantag
rabbit
resign
music afghan
tie
presenc
somebodi
berlusconi
oil
machin
mike
newton
mental
arizona
gunman
china
citi
taken
giffordtucson
supermarket
januari
cultur
teacherpresent
isra
signauction
palestinianjudg
wisconsinunion
document
defens
california
brother
supremnegoti
afghanistan
actress
station
elimin
ballot
date
invit
yale
chrysler
vote
submit
believ
bone
iphonbernank
candid
luck
audienc
understand code
hama
press
nobel
anniversari
white
fire
femal
million
winner
chetri
discov
honor
shourd
bu
neighbor
observ
conclus
medic
counti
dig
difficulti
battl
reserv
birther
rick
doctor
respond
korea
comment
ohio
rep
resid
gold
miller
castro
door
colombia
ted
statu
brief
skull
confer
laugh
kasich
robert
branson
hostag
apolog
comment
repeatedli
weekend
daili
seven smartphon
muslim
releas
northeastgeorgia lightn
mississippi
winter
william
rd
journalist
progress
husband
decemb
publicli
capitol
wife
franklin
logan
bill
complex
investig
marriag
funer
vessel
son
straight
care
match
polic
book
brewer
ball
death
zombi
sen
wear
murder
fashion
senat
line
independ
deadlin
bird
gift
dont
birth
option
anim
hold
statu
aid
ground
delai
crew
parti
highwai
former
putin
appeal
elect
arctic
afternoon
drill
rule
reform
taylor
command
agreement
injunct
vice
rehabilit
academi navi
owner
circuit
biden
readi
sailor
richard
buffett
gibb
virginia
assign
free
anderson
award
novel
appreci
elector
hero
male
claim
yard
duncan
film
olbermann
answer
certif
presid
jersei
climb
bail
toilet
walter
councilman
treat
mad
morgan
poll
best
eight
schiller
jonathan
blood
guilti
statu
heart
samantha
western
reed
prison
deliv
draft
thank
job
second
southern
arkansa
april
show
houston
storm
sea
spacecraft
tattoo
south
motorist
fish
differ
snake
result
file
homelessmessag
train
tornado
planetmercuri phasejackson
research
discoveriearth
space nasa
challeng shuttl
mel
volunt
park
choos
sheensudaninterview
violent
univers
experi
burn
round
kentucki
memori
photograph
wildfir
hurt
volcanoglobecommiss
shark
plastic
taco
fun
whale
buri armstrong farm
busi
aviat
airlin
hire
swept mcdonald
cairo
museum
kati
outreach
15
16
17
...
Figure 2. Browsable counting grid. A. The text and image representation of the grid are combined with emphasis on text. In two locations images are
brought into the foreground. The grid is defined on a torus (with left matching the right and the top continuing at the bottom). Various theme drifts
are visible, e.g. the japan-tsunami-water-whale-study- scientist-research-development-space-shuttle-nasa-command-navy semicircle on the left, or the
region emphasized in B) and C) which captures the various disasters from the period. More can be seen in figures in supplemental material and the
video submission. The preprocessing of the words reduced them to their roots and also made other standard alterations used in text analysis, but the
unaltered words can be shown instead. B. Images mapped in the highlighted area. C. More of the top words in the highlighted area, and an illustration
of how the images were embedded: As each document maps onto a window, the images from the document go to a location in the window (top left in
the illustration to avoid clutter, but the middle of the window in actual implementation to provide more natural alignment). D. Some of the news that
mapped to the highlighted area. The area of interest can be selected by cursor hover and the news can be recalled by a simple click.
................
................
In order to avoid copyright disputes, this page is only a partial summary.
To fulfill the demand for quickly locating and searching documents.
It is intelligent file search solution for home and business.
Related searches
- words that stand out on a resume
- open documents on windows 10
- find lost documents on computer
- select documents on japanese war crimes
- old documents on this computer
- create documents on pc
- find documents on this computer
- find all word documents on this computer
- skills that look good on a resume
- where are my documents on this computer
- can t open documents on desktop
- can t see documents on desktop