Reading #3

Computational Information Design

By Ben Fry

The ability to collect, store, and manage data is increasing quickly, but our ability to understand it remains constant. In an attempt to gain better understanding of data, fields such as information visualization, data mining and graphic design are employed, each solving an isolated part of the specific problem, but failing in a broader sense: there are too many unsolved problems in the visualization of complex data. As a solution, this dissertation proposes that the individual fields be brought together as part of a singular process titled Computational Information Design.

→ read chapter 3 (pp. 33–50) of his dissertation

  1. Fry quotes Tufte, stating, "graphical excellence consists of complex ideas communicated with clarity, precision, and efficiency," to solidify the understanding of the role of graphic visualization as "a divergence from the expectation that somehow visual design serves to make data pretty or entertaining."

Has this rule changed now? Even if the purpose of visualization is to give our minds a diving board to jump off of to "amplify cognition," isn't there some level of aesthetic pleasure involved? And does this aesthetic value also create a hindrance or bias?

  1. Visualization helps create clarity, but when big data is presented, is it only valuable for seeing trends and correlations, or can it be just as valuable to understand data sets individually?

  2. There seems to be a separation between the people generating the data and those creating the visualizations for synthesis. How can we bring them together to create an interdisciplinary export of data?

R3

"Instead, the field of graphic design can be employed, which provides the skills to weigh the variables against one another in the context of the overall data set, and to handle the resulting image as a whole."

Q1 - It sounds like data visualization is about presenting the relationships between variables more effectively. What if the variables can't be compared against each other?

"Visualization – the use of computer-supported, interactive, visual representations of data to amplify cognition."

Q2 - Why do visualization and information visualization have to be "computer-supported"? Designers can produce interactive data models by hand.

"The books fall short in addressing three important aspects of contemporary information design problems. Notably absent are situations in which the data in question is undergoing continual change. In addition, none of the examples have the complexity of something as vast as the human genome. And finally, the texts have little to say of methods for interacting with the data as a way to learn about it."

"Three dimensional spreadsheet systems have not yet taken hold as a standard means of understanding multiple layers of data as implemented in Financial Viewpoints, perhaps because while an interesting concept, it diverged too far from standard modes of understanding data employed by the target audience."

"He balances the automation of the computer against the reasoned hand of the designer to build systems that, rather than replacing designers, augment the abilities of the designer or open possibilities where there were none before. This distinction is important, because it provides a more realistic attribution to the abilities of machines (primarily to reduce repetitive tasks) versus those of human beings (ability to reason and make complex decisions). Without a major breakthrough in artificial intelligence, this distinction will hold for many years to come."

Q3 - Is AI the answer to continual and complex information visualization? Realistically, it is not possible for a human to hand-design something as complex and vast as the human genome. In other words, how much we can design is limited by how much data we can process and understand, and so far computers are doing a better job at that.

  1. Fry describes data visualization as operating as "external cognition" - can this ever be done with more qualitative information?

  2. Should there be a step added to Grinstein’s 5 steps that incorporates user testing to ensure that the message being communicated is clear and approachable?

  3. It is interesting and exciting to think about the machine reducing repetitive human tasks. Is there a time, though, when designers lose some potential insight and intimacy with their data as they rely on programming?

1. Does a visualization necessarily have to be computer-supported?
2. How come visual design tends to be left out from these definitions and sets of steps?
3. We can understand patterns easily, but how can we bring attention to small nuances that might be important?

1 - How are we defining 'visualization' and 'information visualization' now, especially when most of what we see online is a visualization of information/data?

... What about our cognition and downloading cognition into technology? What form of visualization is that?

2 - How can we extend/include visualization into spatialization?

... If we respond better to stories than statistics, could/should we consider using our own spatial abilities and senses as a medium to communicate data?

3 - How do we collect, analyze, and communicate data that is not qualitative?

Q1 - Ben Fry illustrates the importance of "pre-attentive" information, using the example that the human brain can more quickly parse information that is visual in nature and come to a qualitative understanding. Whereas statistics deals with a set of well-defined biases in the collection of data (voluntary bias, response bias, etc.), what biases are we likely to encounter in data visualization? Are the biases justified because of our pursuit of a qualitative understanding of the information? At what point does our work become uninformative?

"While that may be a side effect, the issue is not that the visual design should be “prettier”. Rather, that the approach of the visual designer solves many common problems in typical information visualization...The issue is about diagrams that are accurate, versus those that are understandable (or subjectively, those that viewers prefer to look at)."

Q2 - Ben Fry argues in his example of the Treemap project that the visualization suffers from a number of layout issues. How can we quantify these issues, or are they purely subjective? For instance, the person who created Treemap may have a thorough understanding of the information, and what we consider "visual noise" may only be there to help further his understanding. It thus seems like our efforts are focused on giving the lay person a way to come to a quick subjective understanding of the information.

All visualizations seem to require the viewer to have some degree of contextualizing or orienting information. Considering the designer as a lay individual, is it possible for him or her to create visualizations that require a thorough understanding of the subject matter, and if so, how?

Q3 - How does a designer deal with contaminated data? What is the likelihood that any given person will look at a visualization multiple times, more specifically, an updated visualization? Given the difficulty of censoring information, how can we ensure that correct information is disseminated?

I'm not sure how effective it is to have smaller, digestible visualizations when the data set is huge; you'd only see the bigger picture. Maybe this is where interactive visualizations become important in understanding both micro and macro scales?

Are the best visualizations the ones that are understood without a written description, or the ones most quickly understood?

Should we standardize a set of tools and processes for data processing and visualization? Standardization would allow for better development of tools and greater accessibility for anyone interested in using them (not just designers and programmers).

  1. Which of Fry's four points (that existing data design paradigms do not account for extremely large data sets, changing data sets, interactive platforms, and accessibility by a wide range of users) still apply ten years later, and in what contexts?

  2. How much statistics should designers know, and how much design should statisticians and scientists know? What's a productive way to bridge this gap without creating untenable expectations?

  3. How have the roles of stats and dataviz templates changed over the years? How might they continue to change; are they being used responsibly?