Friday, December 30, 2016

CiteRivers: visual analytics of citation patterns

Note:

This visual tool aims to help users explore the citation network of a given set of publications (e.g., conference proceedings). It shows the citation word cloud, trends, diversity, authors, and publication venues.

Points:
  • It is not clear how the scale of the stream panel relates to the spectral clustering, and the benefit of using clustering techniques to show publications in a river style is not well motivated. 
  • An across-stream citation analysis would be useful, i.e., selecting more than one cell of the river. 
  • Word meanings in the word cloud can vary. E.g., "network" carries multiple meanings across research areas, even within the same domain. 
  • The use case showed the citation pattern of given IEEE publications but lacked a discussion of the patterns found. That discussion may be the key value to the target users. 
  • A use case that may be interesting: which year's work are a given year's publications mainly based on? This could serve as an influence index for past works (and their authors). 



VIS15 preview: CiteRivers: Visual Analytics of Citation Patterns from VGTCommunity on Vimeo.


Reference:
  1. Heimerl, Florian, et al. "CiteRivers: visual analytics of citation patterns." IEEE transactions on visualization and computer graphics 22.1 (2016): 190-199.


A visual analytics agenda

Note

This paper points out the potential research directions for visual analytics.

  • enable users to obtain deep insight for assessment, planning, and decision making. 
  • let users see, explore, and understand large amounts of information simultaneously. 
  • convert all types of conflicting and dynamic data in ways that support visualization and analysis.
  • communicate the information in the appropriate context to a variety of audiences. 
The science of analytical reasoning, taking a crisis event as an example:
  • understanding historical and current situations. 
  • identifying possible alternative future scenarios.
  • monitoring current events to identify both expected and unexpected events. 
  • determining indicators of the intent of an action or an individual.
  • supporting the decision maker in times of crisis. 

Visual representations and interaction technologies

  • facilitate understanding of massive and continually growing collections of data of multiple types. 
  • provide frameworks for analyzing spatial and temporal data
  • support the understanding of uncertain, incomplete, and misleading information. 
  • provide user and task-adaptable guided representations that enable full situation awareness while supporting development of detailed actions. 
  • support multiple levels of data and information abstraction, including integration of different types of information into a single representation. 
Data representations and transformations
  • transforming data into new scalable representations that faithfully represent the underlying data's relevant content. 
  • synthesizing different types of information from different sources into a unified data representation, so users can focus on the data's meaning in the context of other relevant data.
  • developing methods and principles for representing data quality, reliability, and certainty measures throughout the data transformation and analysis process. 

Reference
  1. Thomas, James J., and Kristin A. Cook. "A visual analytics agenda." IEEE computer graphics and applications 26.1 (2006): 10-13.

Effectively Communicating Numbers & Tapping the power of visual perception

Note

A good introductory white paper about how to present quantitative information for business; good basic reading material [1]. Data visualization works by engaging the "visual perception" of the user: the "human visual system is a pattern seeker of enormous power and subtlety". On the other hand, if the data is presented the wrong way, the user may not be able to catch otherwise invisible patterns, which makes the communication less effective.

The human eye catches light and translates it into color and thoughts [2]. Light falling on the fovea receives the most attention and detail; the other parts of the retina capture less detail but stand ready to catch any change, e.g., something moving or popping up. Besides, the human brain has long- and short-term memory. Short-term memory processes and discards incoming information, like RAM in a computer: fast, but with very limited capacity. Long-term memory requires more time to organize but lasts longer for later use, like a hard drive. It is crucial to design visualizations that follow the nature of these human-brain preferences.

There are two kinds of attention in visual perception: pre-attentive and attentive. Pre-attentive processing is very quick and parallel, like something popping out at your eye. For instance, in a serial number with a highlighted target digit, the highlighted digit jumps out of the sequence for the user to recognize. Pre-attentive attributes can accurately encode numbers only in 2D locations, e.g., 2D scatter plots. More than 2D turns the display into an attentive process, which requires more time and serial processing effort. One exception is using colored points to distinguish categories. Hence, a 2D scatter plot with categorical color may be the best choice for users to understand the data.
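The encoding the note recommends can be sketched as plain data preparation: two quantitative variables become 2D position and one categorical variable becomes a color. A minimal sketch, with invented records and palette:

```python
# Minimal sketch: map two quantitative fields to (x, y) position and one
# categorical field to hue -- the combination argued to be readable
# pre-attentively. Records and palette values are made up for illustration.
records = [
    {"height": 1.70, "weight": 68, "group": "A"},
    {"height": 1.55, "weight": 52, "group": "B"},
    {"height": 1.82, "weight": 80, "group": "A"},
]
palette = {"A": "#1b9e77", "B": "#d95f02"}  # one distinct hue per category

# Each point carries its position plus a pre-attentive color attribute.
points = [(r["height"], r["weight"], palette[r["group"]]) for r in records]
print(points[0])  # (1.7, 68, '#1b9e77')
```

Any plotting library could then draw these tuples directly as a colored scatter plot.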

Reference
  1. Few, Stephen. "Effectively Communicating Numbers." Perceptual Edge White Paper (2005).
  2. Few, Stephen. "Tapping the power of visual perception." Visual Business Intelligence Newsletter (2004).

Google+ ripples: A native visualization of information flow

Note

A nested-circles style to present the temporal pattern of re-sharing. The sharing actions are structured like a treemap, and the nested circles help to highlight the clusters in each branch. This paper discusses design factors including the social media sharing pattern, rendering, interaction, and animation. I think it would be a useful way to tell a story about temporal, social-network trends. The display makes it easy for the user to understand the whole picture of how a certain topic or post spreads.



An extended reading on the nested circles: [2]. That paper models exploratory search tasks as a radar plot; the user can drag items of interest into the plot to filter the results. In [1], the figure shows the social media sharing pattern as circles, while [2], from a different perspective, uses them to help the user filter results. The two scenarios may be mutually relevant.

Reference
  1. Viégas, Fernanda, et al. "Google+ ripples: A native visualization of information flow." Proceedings of the 22nd international conference on World Wide Web. ACM, 2013.
  2. Kangasrääsiö, Antti, et al. "Interactive Modeling of Concept Drift and Errors in Relevance Feedback." arXiv preprint arXiv:1603.02609 (2016).





Thursday, December 29, 2016

The structure of the information visualization design space

Note:

This paper provides a framework to organize and structure visualization plots. It considers the following features:

  1. Data Type: Nominal, Ordinal, Quantitative, Intrinsically Spatial, Geographical, Set mapped to itself
  2. Function for recording data: filter, sorting, multidimensional scaling, interactive input of a function
  3. Recorded Data Type: same as Data Type
  4. Control Processing: tx (text)
  5. Mark Type: point, line, surface, area, size
  6. Retinal properties: color, size, connection, enclosure
  7. Position in space-time: N (Nominal), O (Ordered), Q (Quantitative)
  8. View transformation: ::=nb (hyperbolic mapping)
  9. Widget: slider, radio buttons
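Under this framework, each plot is one point in the design space, describable along the dimensions above. A rough sketch of what such a description could look like (the field names paraphrase the framework; the example values, for a 2D scatter plot, are my own illustration):

```python
# Hypothetical spec for a 2D scatter plot expressed along the framework's
# dimensions. Field names and values are illustrative, not the paper's
# exact notation.
scatter_plot_spec = {
    "data_type": ["Quantitative", "Quantitative", "Nominal"],  # x, y, category
    "recording_function": "filter",
    "mark_type": "point",
    "retinal_properties": ["color"],        # category encoded by hue
    "position_in_space_time": ["Q", "Q"],   # both axes quantitative
    "view_transformation": None,            # no hyperbolic or other mapping
    "widgets": ["slider"],                  # e.g. to filter a value range
}

# Structuring specs this way makes the design space queryable, e.g.:
uses_color = "color" in scatter_plot_spec["retinal_properties"]
print(uses_color)  # True
```

A collection of such specs could then be filtered to find, say, all designs that rely on a given retinal property.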


For example: Multi-Dimensional Tables


Points: 1) many of these visualizations are not web-based; is there any particular reason to use web standards? 2) For web-based visualization, how does the framework differ? E.g., a web-based application may use more mouse gestures to click, scale, and hover; or, with the help of libraries like D3.js, how does that influence the implementation of data visualization? 3) The design space for non-web-based applications is more open and less limited, but their accessibility is weaker for sharing and collaboration.

Worth reading further: [2], [3], [4] on the web-based space of data visualization.

Reference:
  1. Card, Stuart K., and Jock Mackinlay. "The structure of the information visualization design space." Information Visualization, 1997. Proceedings., IEEE Symposium on. IEEE, 1997.
  2. Figueiras, Ana. "A Typology for Data Visualization on the Web." IV 13 (2013): 351-358.
  3. Turetken, Ozgur, and Ramesh Sharda. "Visualization of web spaces: state of the art and future directions." ACM SIGMIS Database 38.3 (2007): 51-81.
  4. Brath, Richard, and Ebad Banissi. "Using Typography to Expand the Design Space of Data Visualization." She Ji: The Journal of Design, Economics, and Innovation 2.1 (2016): 59-87.

A Tour through the Visualization Zoo

Note

This paper introduces the basic chart types for data visualization. The schemes mentioned include:
  • Time Series Data: Index Chart



  • Time Series Data: Stacked Graph


  • Time Series Data: Small Multiples



  • Time Series Data: Horizon Graph




  • Statistical Distribution: Stem-and-Leaf Plot



  • Statistical Distribution: Q-Q Plots
  • Statistical Distribution: Scatter Plot

  • Statistical Distribution: Parallel Coordinates

  • Maps: Flow Map



  • Maps: Choropleth Map

  • Hierarchies: Node-Link


  • Adjacency Diagrams: Icicle Tree Layout

  • Adjacency Diagrams: Enclosure Diagrams



  • Hierarchies: Treemap


  • Hierarchies: Nested Circles

  • Network: Force-directed Layout



  • Arc Diagram



  • Matrix View



Reference
  1. Heer, Jeffrey, Michael Bostock, and Vadim Ogievetsky. "A Tour through the Visualization Zoo." Communications of the ACM 53.6 (2010): 56-67.

High-dimensional data visualization

Note:

This paper introduces basic plots for displaying multi-dimensional data. The schemes mentioned include:

  • Mosaic Plots
This plot is good for displaying categorical data, letting the user compare the differences between features. But it requires the user to pay attention in multiple directions (top/bottom, left/right), which makes it harder to follow and weakens perception. Besides, this plot provides a quick categorical overview, but not one for ordinal and interval variables.

  • Trellis Displays

Nice for providing comparisons between variables, but not suitable for temporal or categorical data. Besides, many of the cells may be repeated or empty.

  • Parallel Coordinate Plots

Nice for showing temporal data, but it requires skill to solve the overplotting, scaling, and sorting problems.

  • Projection Pursuit and the Grand Tour



It is not easy for the human brain to process a 3D plot, but it shows the dynamics between dimension projections. For instance, using a scatterplot with 3 dimensions and letting the user explore patterns across dimensions is one type of grand tour.
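One step of such a tour can be sketched as projecting points onto a 2D plane that rotates over time, so structure across dimensions becomes visible as motion. A minimal sketch with invented sample points (not the actual grand tour path-selection algorithm, which chooses projections more carefully):

```python
# Project 3D points onto a rotating 2D viewing plane: rotate in the x-z
# plane by `angle` and keep (x', y) as screen coordinates.
import math

def project_2d(points, angle):
    """One 'tour frame': rotated x combined with unchanged y."""
    out = []
    for x, y, z in points:
        xr = x * math.cos(angle) + z * math.sin(angle)
        out.append((xr, y))
    return out

cloud = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
frame0 = project_2d(cloud, 0.0)           # angle 0: the plain (x, y) view
frame90 = project_2d(cloud, math.pi / 2)  # angle 90 degrees: z replaces x
print(frame0)  # [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
```

Animating the angle smoothly between such frames is what lets the user perceive the dynamics across dimensions.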


Summary


A summary of the exploration and presentation functionality, including the interactivity of each plot. However, I think Trellis displays may also be made interactive, e.g., this demo.


Reference
  1. Theus, Martin. "High-dimensional data visualization." Handbook of data visualization. Springer Berlin Heidelberg, 2008. 151-178.

Wednesday, December 28, 2016

Collaborative visual analysis with RCloud

Note

This paper discussed a collaborative visual analysis environment for teamwork. In a data-science-related project, it is very common to design, analyze, and deliver results to a target audience, which could be a colleague, a customer, or your boss; this is a process of exploratory data analysis (EDA). The paper argues that these steps are usually done with different tools, i.e., coding in a scripting language and designing the interface with web techniques. This makes collaborative work very difficult, due to the lack of discoverability (code reuse), technology transfer (collaboration), and coexistence (with interactive visualization tools). Hence, this paper proposed a framework, RCloud, which uses R to integrate the back-end analysis and front-end display in a RESTful API structure. The basic idea is that every application natively presents its result to users through web browsers. The framework reuses and couples existing R packages.

Points: for a small team with low project-requirement dynamics, I think this framework would work well. However, if more and more projects (usually small, with immature results) go live, search and reuse may create extra workload for developers. On the other hand, R packages may not be suitable for solving all practical problems, e.g., large-scale data storage or distributed computing tasks. Besides, there are more framework options that better facilitate collaboration between developers and designers, e.g., MVC frameworks. I think a good framework should be independent of any specific language and techniques, so it can generally support dynamic real-world requirements.

I actually like this idea; it shows the value of delivering beta work to users. It would be good if we could put research findings or preliminary results on the web for better potential collaboration, public exposure, and self-advertisement. Another trend is using Scala to bundle the analysis, implementation, and production.

Reference
  1. North, Stephen, et al. "Collaborative visual analysis with RCloud." Visual Analytics Science and Technology (VAST), 2015 IEEE Conference on. IEEE, 2015.

EgoNetCloud: Event-based egocentric dynamic network visualization

Note

A quality work on network visualization: this paper proposed a visual analytics tool to display the structure and temporal dynamics of an egocentric dynamic network [1,3]. It considered three important design factors: 1) network simplification: showing all the links in the network graph is meaningless and overloads the user with information, so a reasonable way to "prune" nodes and highlight the important ones is necessary. The authors first defined a weighting function based on co-author number and ordering; building on it, they tried four different approaches to pruning nodes, optimizing an efficiency function that maximizes the total weight in the subgraph.
2) temporal network: the temporal information is presented as a horizon graph along a time axis, which makes identifying the distribution over time a simple task; 3) graph layout: the layout is designed in 2D space. Due to the temporal relationships, the chart divides into several sub-graphs that are hard to fit with a regular force-directed layout, so the authors extend the stress model to calculate the layout [2].
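The simplification idea can be illustrated with a much cruder stand-in than the paper's four pruning approaches: greedily keep the k heaviest nodes and the edges among them, which approximates "maximize total weight in the subgraph". Weights and edges below are invented:

```python
# Rough sketch of weight-based network pruning (NOT the paper's exact
# algorithm): keep the k highest-weight nodes and the induced edges.
def prune(weights, edges, k):
    """Return the k heaviest nodes and the edges both of whose ends survive."""
    keep = set(sorted(weights, key=weights.get, reverse=True)[:k])
    kept_edges = [(u, v) for u, v in edges if u in keep and v in keep]
    return keep, kept_edges

weights = {"ego": 10, "a": 5, "b": 3, "c": 1}  # e.g. from co-author counts
edges = [("ego", "a"), ("ego", "b"), ("ego", "c"), ("a", "b")]

nodes, kept = prune(weights, edges, k=3)
print(sorted(nodes))  # ['a', 'b', 'ego']  -- low-weight node 'c' pruned
print(kept)           # [('ego', 'a'), ('ego', 'b'), ('a', 'b')]
```

The paper's versions differ in how they trade off which nodes to drop, but all aim at the same objective of retaining the heavy sub-graph around the ego.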

Points: 1) the research methodology of visual analytics, from design and implementation through case study to user study; the user study design is a useful reference for my research; 2) treating a single publication as an event to form the egocentric network may support multiple use cases, e.g., urban computing, conferences, news events, etc. This system is suitable for exploring the relationships in a given dataset, for temporal and egocentric tasks; 3) the interaction of sliders on time and weighting items is useful for a user exploring the content. It may potentially help a user understand the deeper relationships of a given person. This idea may also link to the explanation function in recommender systems.

A citation worth reading: [4].

Reference
  1. Liu, Qingsong, et al. "EgoNetCloud: Event-based egocentric dynamic network visualization." Visual Analytics Science and Technology (VAST), 2015 IEEE Conference on. IEEE, 2015.
  2. Gansner, Emden R., Yehuda Koren, and Stephen North. "Graph drawing by stress majorization." International Symposium on Graph Drawing. Springer Berlin Heidelberg, 2004.
  3. Shi, Lei, et al. "1.5 d egocentric dynamic network visualization." IEEE transactions on visualization and computer graphics 21.5 (2015): 624-637.
  4. Zheng, Yixian, et al. "Visual Analytics in Urban Computing: An Overview." IEEE Transactions on Big Data 2.3 (2016): 276-296.

Tuesday, December 27, 2016

Following scholars



Network Science
Recommendation System
Visualization

Thursday, September 8, 2016

A review of NSF funding on recommender system explanation.


Summary

I found that explanation in recommender systems is a potential research subject in a few different areas, so I reviewed the relevant projects within the NSF funding award list. Here are some of my thoughts on each project.
  • This project focuses on exploring how recommendation and explanation influence user interaction in social media. Hence, the experiment basically follows the existing functions in social media services, e.g., Facebook and Twitter. The project implied that the recommendations and explanations provided by social media did change users' online behavior. There are many criticisms of social media "manipulating" public opinion with its ranking algorithms; this concern makes the project meaningful, because how the presented information affects user preferences is still little understood. According to the structure in my previous post, this is the first layer of the structure. 
  • This project relates to what we are doing now on people recommendation in CN3. I think it does require a mobile version of the application to better capture face-to-face human interaction; that would be the current trend for conducting such a study. The project plans to run the experiment with a group of freshman students, which I think would be a more stable user-study setting for collecting long-term interaction patterns. 
  • This project is very relevant to what I plan to do. They focus on issues around sensitive user data in personalized systems. First, they plan to identify the challenges for mobile users with sensitive content. Second, develop a system with different personalization techniques to measure their prevalence. Third, identify personalized political content. Fourth, study personalized financial and health information applications. The PI and co-PI have strong connections with commercial companies to gather the necessary datasets and user-study environments. With a real-world dataset and system, claims about privacy challenges and patterns make more sense. I wonder if there is a need for a conflict-of-interest-free study from my side, say, a standalone small-scale experiment to replicate or further explain some of the issues that cannot be answered with the real-world system or dataset. 
  • This is a project on scientific data visualization in an animation format. Our goal is not very relevant to this one, but the idea of how they turn data into animation and make it easy for users to understand or use could give us insight into converting recommender system results into a format users easily understand. 
  • This project focuses on decision support from machine learning, asking how interpretable machine learning techniques can help users make better decisions. They use a well-known classifier, KNN, as an example, and examine interpretability along three metrics: simplicity, verifiability, and accountability. The experiment focuses on making the classifier intuitive to users, predictable, and controllable. A similar idea could also suit recommender systems, but the problem is how to make the idea novel rather than repeating the same idea from this project. 
  • A new direction of integrating human behavior into machine learning algorithms. This project aims to better accommodate human behavior in the design of machine learning algorithms. Some recent publications have started to address the issues and challenges in this area of "humans in the loop". This project also focuses on the decision process and support from humans, and furthermore on designing a better interface to connect the learning algorithm and human behavior, ultimately forming a human-interactive learning system. I wonder about the connection between "human in the loop" and the human in "user modeling". In recommender studies, we model users based on their preferences and historical data, but few studies discuss how to let users join and understand the process. 
  • I think this project answers my concern about "III: Medium: Machine Learning with Humans in the Loop". It intends to develop an interface to extract users' high-level knowledge for better user or data modeling; the visual interface can help analyze the human-machine interaction. As a long-term grant, this implies some of the potential of this research direction. But I think the visual analytics approach is only one way to engage users with the system. In some cases, for example text mining, a visual tool may not be that useful. There may be further work I can pursue in recommender system studies. 
  • This project discusses the public awareness of widely used algorithms. It is the most recent funding, awarded in July 2016, and reflects the state of the art of current research. According to my literature review, this is right on the spot of the most promising research topic in this area. What I want to do for recommender systems is pretty similar to this project; the main difference is that I focus more on recommending items or people, while this project emphasizes social media. However, I agree that building on social media would be more suitable, or simpler, for responding to some of the interesting issues and challenges across disciplines, for instance, the law and ethics of post ranking on Facebook or Twitter affecting users' political leanings. Even so, I think recommender systems can answer this question from a more generalized perspective, say, how transparency can help address media bias, as well as the further issues in different areas, e.g., e-commerce or location-based people-recommendation systems. 


(*Ranked by start-end funding year.)

Tuesday, August 9, 2016

Explainable Artificial Intelligence Systems



In a military information system for training or tactics, an after-action review (AAR) is the most common approach to learning from an exercise. With the complexity of artificial intelligence systems, it is hard for users to interrogate or question the outcome of an "AI-controlled system". This causes the following challenges: 1) it is hard for the user to understand how a result is made or processed; 2) user feedback is not well considered by the system; 3) a situation is hard to reproduce for training or debugging purposes; users need to re-run the system until the relevant criteria occur. In [1], the researchers proposed a user interface for a military simulator system in which the user can "ask questions" by subject, time, and entity. 

However, the user interface can only provide "straightforward" information. For example, during a simulation, the user can ask "What is your location/health condition/current task?" All of these are just attributes in the system that are not difficult to retrieve and display. Today, with data mining and machine learning techniques, many attributes lack such a straightforward explanation. For instance, consider a decision made by a targeting system with a deep, multi-layer neural network after hundreds of rounds of training and testing; explaining why it chose A instead of B is a far more challenging issue. 
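The "straightforward" half is easy to picture: the question interface in [1] reduces to looking up recorded attributes by subject, time, and entity. A minimal sketch, with an entirely invented simulation log:

```python
# Hypothetical sketch of the attribute-lookup style of question answering.
# The log keys and entity names are made up for illustration.
log = {
    # (time_step, entity) -> attributes recorded by the simulator
    (1, "unit_7"): {"location": (3, 4), "health": 90, "task": "patrol"},
    (2, "unit_7"): {"location": (4, 4), "health": 85, "task": "patrol"},
}

def ask(subject, time, entity):
    """Answer e.g. 'What is your location?' for one entity at one time step."""
    record = log.get((time, entity))
    return record[subject] if record else None

print(ask("task", 2, "unit_7"))    # patrol
print(ask("health", 1, "unit_7"))  # 90
```

Explaining a learned model's decision has no analogous lookup, which is exactly the gap the paragraph above points at.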

For national security and military purposes, this issue is even more critical in the following aspects: 1) training: if users have no idea how the system works, it is impossible for them to interact with it or to correct a wrong decision made by the system or algorithm; 2) accountability: if the system makes a wrong decision, it is hard to divide responsibility between human and machine; 3) security: if all the data processing and analysis happen in a black box, there is a security concern in using the technique in a real-world environment: no one knows whether the system has been hacked or infected. 

Military systems may seem far away from us, but a similar issue has been discussed for personalized systems in [2]. There are several main issues for a personalized system without "scrutiny and control": privacy, invisibility, error correction, compatibility across systems, and controllability. There seems to be an overlap between the two research directions. In personalized systems, research focuses on the interactivity of user modeling to make the system effective, trustworthy, transparent, and comprehensible; in explainable artificial intelligence systems, the focus is more on AAR for military purposes. For example, [3] provides another case of how exploiting AAR helps users in medical training sessions, for self-evaluation and problem solving; the explainable AI system plays an educational role for training purposes. 

Either of the two directions, plus state-of-the-art machine learning techniques, would be a great research subject. Here is a note on three layers of machine learning categories: 


  • Layer 1: Classifier (Supervised)
    • AdaBoost, logistic regression, SVM, kNN, naive Bayes, decision tree: classification methods basically try to find a point, line, or surface that splits the 2 to N types of elements, based on the training/testing data fed in.
    • The issue here is that we need to extract "features" from the raw data rather than use the raw data directly. The features should reflect the original data's properties as faithfully as possible.
    • It would be simple to show the features in a different latent space, for instance, showing a regression line that resolves the classification question. 
  • Layer 2: Markov Chain (semi-Supervised)
    • Hidden Markov Model (HMM): based on a series of decision processes, to infer something unknown.
    • In a Markov model, we need to define the motion as sequential states with a series of observations. The model is trained to maximize the output probability.
    • In [4], the author dealt with similar issues for a neural network decision process (control and interaction); a self-explanatory approach is still unknown. 
  • Layer 3: Deep learning (Unsupervised)
    • Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Deep Neural Network (DNN): automatically extract features via convolutional, recurrent, or deep strategies, plus the methods above to train/test the models.
    • In the two approaches above, the human needs to extract features based on some inference, for example, a concept from physics; these features are interpreted by prior knowledge. What if, in some cases, feature extraction is almost impossible, e.g., image recognition?
    • In this layer, many of the features are not recognizable. For faces, we can use "eigenfaces" to visualize the image recognition features, but for other domains it is hard to visualize the features. Furthermore, the state-of-the-art approach combines the classifiers in layer 1 with the feature extraction in layer 3. Many challenging research topics remain in algorithms, interfaces, and human perception. 
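Layer 1's explainability claim is concrete enough to sketch: a nearest-neighbor classifier can always point back at the training example that drove its decision, which is why the KNN project above treats it as interpretable. A toy 1-NN with invented data (features assumed already extracted):

```python
# Toy 1-nearest-neighbor classifier that returns its decision together with
# the training example it is based on -- a built-in explanation. Training
# points and labels are made up for illustration.
import math

train = [((1.0, 1.0), "A"), ((1.2, 0.9), "A"), ((5.0, 5.0), "B")]

def classify(point):
    """Return (label, nearest_example) so the decision can be explained."""
    nearest = min(train, key=lambda ex: math.dist(point, ex[0]))
    return nearest[1], nearest[0]

label, because = classify((1.1, 1.0))
print(label)    # A
print(because)  # (1.0, 1.0) -- 'classified A because it is closest to this'
```

Layers 2 and 3 lack this property: a Viterbi path or a learned convolutional filter has no equally direct "because" to show the user.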

Reference
  1. Core, Mark G., H. Chad Lane, Michael Van Lent, Dave Gomboc, Steve Solomon, and Milton Rosenberg. "Building Explainable Artificial Intelligence Systems." Proceedings of the National Conference on Artificial Intelligence, vol. 21, p. 1766. AAAI Press, 2006.
  2. Kay, Judy, and Bob Kummerfeld. “Creating Personalized Systems That People Can Scrutinize and Control: Drivers, Principles and Experience.” ACM Transactions on Interactive Intelligent Systems (TiiS) 2, no. 4 (2012): 24.
  3. Lane, H. Chad, et al. Explainable artificial intelligence for training and tutoring. UNIVERSITY OF SOUTHERN CALIFORNIA MARINA DEL REY CA INST FOR CREATIVE TECHNOLOGIES, 2005.
  4. Kim, Been. Interactive and interpretable machine learning models for human machine collaboration. Diss. Massachusetts Institute of Technology, 2015.

Monday, August 1, 2016

Thoughts on Exposure to ideologically diverse news and opinion on Facebook


Summary

The ranking algorithms of social media are a perennially controversial issue among researchers across disciplines: for instance, the famous debates over the filter bubble and the echo chamber effect. State-of-the-art data mining and machine learning techniques actually reinforce these phenomena. As the Facebook ranking algorithm becomes smarter and smarter, your wall fills with the content you prefer, lacking diversity or multiple voices, and, even worse, open to manipulation for commercial or private purposes.

In this paper, which Facebook published in Science, the issue is for the first time addressed with a massive real-world dataset. The findings are: 1) stronger ideological alignment comes with higher share numbers. In other words, articles with a strong perspective are re-posted more by users and, unsurprisingly, by users of the same alignment (i.e., liberal users tend to re-post more liberal articles and vice versa); 2) homophily of friends: users with similar ideological affiliations tend to friend each other on Facebook. The data analysis shows a clear pattern that both liberals and conservatives have fewer friendship ties across ideological affiliations, in other words, less diversity; 3) the crosscutting percentage drops as content exposure narrows. More specifically, if users could randomly browse all content on Facebook, around 40-45% of it would be crosscutting; the rate drops dramatically when users select from within their friend circle, from algorithmic suggestions, or by themselves. Most interestingly, the paper concludes that the lower-diversity reading/sharing behavior is mainly due to individuals' own choices.
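The quantity behind finding 3 is simple to state: the fraction of items a user sees that come from the other side of their own alignment. A sketch of that computation, with the feeds and alignments invented for illustration (the paper's actual measurement is over link shares, not these toy labels):

```python
# Crosscutting-rate sketch: share of exposed items whose ideological
# alignment differs from the user's own. Feeds and labels are made up.
def crosscutting_rate(items, user_alignment):
    """Fraction of items not matching the user's alignment."""
    cross = sum(1 for a in items if a != user_alignment)
    return cross / len(items)

random_feed = ["lib", "con", "lib", "con"]    # random exposure to all content
filtered_feed = ["lib", "lib", "lib", "con"]  # after ranking/self-selection

print(crosscutting_rate(random_feed, "lib"))    # 0.5
print(crosscutting_rate(filtered_feed, "lib"))  # 0.25
```

The paper's comparison is essentially this rate evaluated at each exposure stage (random potential exposure, friend network, algorithmic feed, clicks).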

This is valuable research because it is the first to reveal detailed patterns from Facebook's real-world dataset. However, I disagree with the conclusion they draw, and it suggests other potential research topics we could pursue. Here are my reasons: 1) assigning the responsibility to users is not fair, because most users have no clue how the algorithm behind the system will affect their future information consumption. For example, the Facebook ranking algorithm penalizes the ranking score if you do not "like" or "share" content you saw, say, news articles; hence, the news articles you ignore will slowly disappear from your wall, and the mechanism is not transparent at all. Users will never know that some content was pre-filtered because of their earlier inaction, and I question whether "ignoring" or not liking really represents a dislike preference for each user; 2) there is no way for users to understand or join the loop of algorithmic processing. Users basically follow the system's suggestions and guidelines, in a very "user-friendly" and "simple" way. There should be either an explanation or an "undo" function so users can maintain diverse content consumption of their own free will. Also, should there be a double-check reminder when you decide to unfollow or dislike something, just as when you permanently delete a file from your computer? 3) users deserve controllability. Why is there only one personalized ranking algorithm for all kinds of users? I think users have the right to choose the preferences they like, rather than having them decided by unknown experts or machine learning algorithms. Only then would it be fair to claim that the lower diversity is due to users' click behavior.

The three points above are potential research topics, in my view. If we view Facebook's ranking as a recommender system, the same discrimination and low-diversity issues may be happening right under our noses. Furthermore, potential conflicts of interest for industry researchers should be disclosed. I admit that researchers at large companies have more resources to answer certain social questions than a laboratory environment does, for example Google's flu-trend prediction and Facebook's study of ideological diversity across real-world users. However, commercial companies are responsible to their stockholders, not to the public. This should be an advantage for academic researchers, who can play a neutral role with respect to their research subjects. It is also the value of building small-scale systems and running controlled experiments.


  1. Bakshy, Eytan, Solomon Messing, and Lada A. Adamic. "Exposure to ideologically diverse news and opinion on Facebook." Science 348.6239 (2015): 1130-1132.

Summary of Interactive and interpretable machine learning models for human machine collaboration


Summary

I find this thesis very relevant to what I want to do in my dissertation study. It aims to close the gap between human and machine collaboration through interactive and interpretable models; more specifically, it develops a framework for "human-in-the-loop machine learning" systems. The thesis consists of three parts: 1) a generative model that reproduces the human decision process. This part extracts the relevant information from natural human decision-making to show that machine learning can effectively predict humans' sequential plans. 2) Case-based reasoning and prototype classification. This part provides meaningful explanations so that users can better engage with the system; its goal is to examine the interpretability of the proposed models. 3) An interactive Bayesian case model, through which humans or experts can contribute their knowledge or preferences to the system. The author built a graphical user interface for human-machine interaction in an online educational system.

Although the thesis structure is very similar to what I want to do, the author focuses more on the Bayesian decision support model, and all the findings and systems are built on it. For example: how does a rescue team form a resource allocation plan through a sequence of decision steps? How can a domain expert contribute knowledge to the model to produce a better result (model accuracy)? How can a graphical interface help users give feedback and obtain better model results? From my own background, though, I would approach the issue from a different perspective: the recommender system.

If I follow the same three-layer structure, the whole idea would be: First, interaction patterns: understand how humans behave toward the machine, e.g. a recommender system built on complex machine learning or data mining techniques. I need to know how humans interact with the system and how the system can help them fulfill the tasks they care about; more specifically, to test which interpretability, transparency, and explanation functions humans need when interacting with a recommender system. Second, an effective system: design a system that helps users retrieve useful information or suggestions. Based on the findings above, we can build a novel system that implements those functions, e.g. the kinds of interpretability, transparency, and explanation features that genuinely help users make better use of the recommender system. Third, a human-in-the-loop model: once we know from the first two parts which functions are necessary and useful, the remaining challenge is to involve humans in the process; more specifically, an interactive recommender system that lets users contribute their preferences or domain knowledge to improve the system and the user experience.
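The third layer above can be sketched as a minimal interactive recommender. This is my own hypothetical design, not code from the thesis; the item features and weight profile are made up for illustration. The point is that the user can read the explanation for a suggestion and then directly override the underlying preference weights:

```python
# Minimal human-in-the-loop recommender sketch (hypothetical design):
# the user inspects the explanation for each suggestion and can edit
# the feature weights that produced it.

ITEMS = {
    "movie_a": {"action": 0.9, "comedy": 0.1},
    "movie_b": {"action": 0.2, "comedy": 0.8},
    "movie_c": {"action": 0.5, "comedy": 0.5},
}

def recommend(weights):
    """Score every item by the user's (editable) feature weights."""
    scored = {
        item: sum(weights.get(f, 0.0) * v for f, v in feats.items())
        for item, feats in ITEMS.items()
    }
    return max(scored, key=scored.get), scored

def explain(item, weights):
    """Transparency: show how much each feature contributed to the score."""
    return {f: weights.get(f, 0.0) * v for f, v in ITEMS[item].items()}

weights = {"action": 0.7, "comedy": 0.3}       # learned or default profile
top, _ = recommend(weights)
print(top, explain(top, weights))              # movie_a, driven by "action"

# Human-in-the-loop step: the user disagrees with the explanation and
# lowers the "action" weight; the recommendation updates accordingly.
weights["action"] = 0.1
top, _ = recommend(weights)
print(top, explain(top, weights))              # now movie_b, driven by "comedy"
```

Even in this tiny sketch, the explanation is what makes the user's intervention possible: without seeing which feature drove the score, the user would not know which weight to change.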

  1. Kim, Been. Interactive and interpretable machine learning models for human machine collaboration. Diss. Massachusetts Institute of Technology, 2015.
  2. Explanation of recommender systems: a literature review.

Explanation of recommender systems: a literature review


The possible research directions…  

  1. Model effectiveness (Effective)
    1. Trustworthiness of the system (Trust)
    2. Personalized result explanation (Survey & Framework)
    3. Transparency issues (Transparency)*
    4. User satisfaction (Perception)
  2. Legal and social issues
    1. Privacy
    2. Accountability of the recommendation result (Decision Support & Issues)*
    3. Discrimination (Diversity)
  3. Educational purposes
    1. Learning the advanced techniques behind recommendations
    2. A stepwise learning model for tuning the system (Debug)
    3. Training for using recommender systems (Comprehensive)

Comprehensive

  1. Al-Taie, Mohammed Z, and Seifedine Kadry. “Visualization of Explanations in Recommender Systems.” Journal of Advanced Management Science Vol 2, no. 2 (2014).
  2. Barbieri, Nicola, Francesco Bonchi, and Giuseppe Manco. “Who to Follow and Why: Link Prediction with Explanations.” In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1266–1275. ACM, 2014.
  3. Blanco, Roi, Diego Ceccarelli, Claudio Lucchese, Raffaele Perego, and Fabrizio Silvestri. “You Should Read This! Let Me Explain You Why: Explaining News Recommendations to Users.” In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 1995–1999. ACM, 2012.
  4. Cleger-Tamayo, Sergio, Juan M Fernandez-Luna, and Juan F Huete. “Explaining Neighborhood-Based Recommendations.” In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1063–1064. ACM, 2012.
  5. Françoise, Jules, Frédéric Bevilacqua, and Thecla Schiphorst. “GaussBox: Prototyping Movement Interaction with Interactive Visualizations of Machine Learning.” In Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems, 3667–3670. ACM, 2016.
  6. Freitas, Alex A. “Comprehensible Classification Models: A Position Paper.” ACM SIGKDD Explorations Newsletter 15, no. 1 (2014): 1–10.
  7. Hernando, Antonio, JesúS Bobadilla, Fernando Ortega, and Abraham GutiéRrez. “Trees for Explaining Recommendations Made through Collaborative Filtering.” Information Sciences 239 (2013): 1–17.
  8. Kahng, Minsuk, Dezhi Fang, and Duen Horng. “Visual Exploration of Machine Learning Results Using Data Cube Analysis.” In HILDA@ SIGMOD, 1, 2016.
  9. Krause, Josua, Adam Perer, and Kenney Ng. “Interacting with Predictions: Visual Inspection of Black-Box Machine Learning Models.” In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 5686–5697. ACM, 2016.
  10. “Understanding LSTM Networks,” n.d. http://colah.github.io/posts/2015-08-Understanding-LSTMs/.
  11. Yamaguchi, Yuto, Mitsuo Yoshida, Christos Faloutsos, and Hiroyuki Kitagawa. “Why Do You Follow Him?: Multilinear Analysis on Twitter.” In Proceedings of the 24th International Conference on World Wide Web, 137–138. ACM, 2015.

Debug

  1. Kulesza, Todd, Margaret Burnett, Weng-Keen Wong, and Simone Stumpf. “Principles of Explanatory Debugging to Personalize Interactive Machine Learning.” In Proceedings of the 20th International Conference on Intelligent User Interfaces, 126–137. ACM, 2015.
  2. McGregor, Sean, Hailey Buckingham, Thomas G Dietterich, Rachel Houtman, Claire Montgomery, and Ronald Metoyer. “Facilitating Testing and Debugging of Markov Decision Processes with Interactive Visualization.” In Visual Languages and Human-Centric Computing (VL/HCC), 2015 IEEE Symposium on, 53–61. IEEE, 2015.

Decision Support

  1. Ehrlich, Kate, Susanna E Kirk, John Patterson, Jamie C Rasmussen, Steven I Ross, and Daniel M Gruen. “Taking Advice from Intelligent Systems: The Double-Edged Sword of Explanations.” In Proceedings of the 16th International Conference on Intelligent User Interfaces, 125–134. ACM, 2011.
  2. Jameson, Anthony, Silvia Gabrielli, Per Ola Kristensson, Katharina Reinecke, Federica Cena, Cristina Gena, and Fabiana Vernero. “How Can We Support Users’ Preferential Choice?” In CHI’11 Extended Abstracts on Human Factors in Computing Systems, 409–418. ACM, 2011.
  3. Martens, David, and Foster Provost. “Explaining Data-Driven Document Classifications,” 2013.
  4. McSherry, David. “Explaining the Pros and Cons of Conclusions in CBR.” In European Conference on Case-Based Reasoning, 317–330. Springer, 2004.
  5. Tan, Wee-Kek, Chuan-Hoo Tan, and Hock-Hai Teo. “Consumer-Based Decision Aid That Explains Which to Buy: Decision Confirmation or Overconfidence Bias?” Decision Support Systems 53, no. 1 (2012): 127–141.

Diversity

  1. Graells-Garrido, Eduardo, Mounia Lalmas, and Ricardo Baeza-Yates. “Data Portraits and Intermediary Topics: Encouraging Exploration of Politically Diverse Profiles.” In Proceedings of the 21st International Conference on Intelligent User Interfaces, 228–240. ACM, 2016.
  2. Szpektor, Idan, Yoelle Maarek, and Dan Pelleg. “When Relevance Is Not Enough: Promoting Diversity and Freshness in Personalized Question Recommendation.” In Proceedings of the 22nd International Conference on World Wide Web, 1249–1260. ACM, 2013.
  3. Yu, Cong, Sihem Amer-Yahia, and Laks Lakshmanan. Diversifying Recommendation Results through Explanation. Google Patents, 2013.
  4. Yu, Cong, Laks VS Lakshmanan, and Sihem Amer-Yahia. “Recommendation Diversification Using Explanations.” In 2009 IEEE 25th International Conference on Data Engineering, 1299–1302. IEEE, 2009.

Effective

  1. Komiak, Sherrie YX, and Izak Benbasat. “The Effects of Personalization and Familiarity on Trust and Adoption of Recommendation Agents.” MIS Quarterly, 2006, 941–960.
  2. Nanou, Theodora, George Lekakos, and Konstantinos Fouskas. “The Effects of Recommendations’ Presentation on Persuasion and Satisfaction in a Movie Recommender System.” Multimedia Systems 16, no. 4–5 (2010): 219–230.
  3. Tan, Wee-Kek, Chuan-Hoo Tan, and Hock-Hai Teo. “When Two Is Better Than One–Product Recommendation with Dual Information Processing Strategies.” In International Conference on HCI in Business, 775–786. Springer, 2014.
  4. Tintarev, Nava, and Judith Masthoff. “Effective Explanations of Recommendations: User-Centered Design.” In Proceedings of the 2007 ACM Conference on Recommender Systems, 153–156. ACM, 2007.
  5. ———. “The Effectiveness of Personalized Movie Explanations: An Experiment Using Commercial Meta-Data.” In International Conference on Adaptive Hypermedia and Adaptive Web-Based Systems, 204–213. Springer, 2008.

Framework

  1. Ben-Elazar, Shay, and Noam Koenigstein. “A Hybrid Explanations Framework for Collaborative Filtering Recommender Systems.” In RecSys Posters. Citeseer, 2014.
  2. Berner, Christopher Eric Shogo, Jeremy Ryan Schiff, Corey Layne Reese, and Paul Kenneth Twohey. Recommendation Engine That Processes Data Including User Data to Provide Recommendations and Explanations for the Recommendations to a User. Google Patents, 2013.
  3. Charissiadis, Andreas, and Nikos Karacapilidis. “Strengthening the Rationale of Recommendations Through a Hybrid Explanations Building Framework.” In Intelligent Decision Technologies, 311–323. Springer, 2015.
  4. Chen, Wei, Wynne Hsu, and Mong Li Lee. “Tagcloud-Based Explanation with Feedback for Recommender Systems.” In Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, 945–948. ACM, 2013.
  5. Chen, Yu-Chih, Yu-Shi Lin, Yu-Chun Shen, and Shou-De Lin. “A Modified Random Walk Framework for Handling Negative Ratings and Generating Explanations.” ACM Transactions on Intelligent Systems and Technology (TIST) 4, no. 1 (2013): 12.
  6. Du, Zhao, Lantao Hu, Xiaolong Fu, and Yongqi Liu. “Scalable and Explainable Friend Recommendation in Campus Social Network System.” In Frontier and Future Development of Information Technology in Medicine and Education, 457–466. Springer, 2014.
  7. El Aouad, Sara, Christophe Dupuy, Renata Teixeira, Christophe Diot, and Francis Bach. “Exploiting Crowd Sourced Reviews to Explain Movie Recommendation.” In 2nd Workshop on Recommendation Systems for Television and Online Video, 2015.
  8. Jameson, Anthony, Martijn C Willemsen, Alexander Felfernig, Marco de Gemmis, Pasquale Lops, Giovanni Semeraro, and Li Chen. “Human Decision Making and Recommender Systems.” In Recommender Systems Handbook, 611–648. Springer, 2015.
  9. Lamche, Béatrice, Ugur Adıgüzel, and Wolfgang Wörndl. “Interactive Explanations in Mobile Shopping Recommender Systems.” In Proc. Joint Workshop on Interfaces and Human Decision Making for Recommender Systems (IntRS 2014), ACM Conference on Recommender Systems, Foster City, USA, 2014.
  10. Lawlor, Aonghus, Khalil Muhammad, Rachael Rafter, and Barry Smyth. “Opinionated Explanations for Recommendation Systems.” In Research and Development in Intelligent Systems XXXII, 331–344. Springer, 2015.
  11. Muhammad, Khalil. “Opinionated Explanations of Recommendations from Product Reviews,” 2015.
  12. Nagulendra, Sayooran, and Julita Vassileva. “Providing Awareness, Explanation and Control of Personalized Filtering in a Social Networking Site.” Information Systems Frontiers 18, no. 1 (2016): 145–158.
  13. Schaffer, James, Prasanna Giridhar, Debra Jones, Tobias Höllerer, Tarek Abdelzaher, and John O’Donovan. “Getting the Message?: A Study of Explanation Interfaces for Microblog Data Analysis.” In Proceedings of the 20th International Conference on Intelligent User Interfaces, 345–356. ACM, 2015.
  14. Tintarev, Nava. “Explanations of Recommendations.” In Proceedings of the 2007 ACM Conference on Recommender Systems, 203–206. ACM, 2007.
  15. Tintarev, Nava, and Judith Masthoff. “Explaining Recommendations: Design and Evaluation.” In Recommender Systems Handbook, 353–382. Springer, 2015.
  16. Vig, Jesse, Shilad Sen, and John Riedl. “Tagsplanations: Explaining Recommendations Using Tags.” In Proceedings of the 14th International Conference on Intelligent User Interfaces, 47–56. ACM, 2009.
  17. Zanker, Markus, and Daniel Ninaus. “Knowledgeable Explanations for Recommender Systems.” In Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on, 1:657–660. IEEE, 2010.

Issues

  1. Bunt, Andrea, Matthew Lount, and Catherine Lauzon. “Are Explanations Always Important?: A Study of Deployed, Low-Cost Intelligent Interactive Systems.” In Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces, 169–178. ACM, 2012.
  2. Burke, Brian, and Kevin Quealy. “How Coaches and the NYT 4th Down Bot Compare.” New York Times, 2013. http://www.nytimes.com/newsgraphics/2013/11/28/fourth-downs/post.html.
  3. Diakopoulos, Nicholas. “Accountability in Algorithmic Decision-Making.” Queue 13, no. 9 (2015): 50.
  4. ———. “Algorithmic Accountability: Journalistic Investigation of Computational Power Structures.” Digital Journalism 3, no. 3 (2015): 398–415.
  5. Lokot, Tetyana, and Nicholas Diakopoulos. “News Bots: Automating News and Information Dissemination on Twitter.” Digital Journalism, 2015, 1–18.

Perception

  1. Gkika, Sofia, and George Lekakos. “The Persuasive Role of Explanations in Recommender Systems.” In 2nd Intl. Workshop on Behavior Change Support Systems (BCSS 2014), 1153:59–68, 2014.
  2. Hijikata, Yoshinori, Yuki Kai, and Shogo Nishida. “The Relation between User Intervention and User Satisfaction for Information Recommendation.” In Proceedings of the 27th Annual ACM Symposium on Applied Computing, 2002–2007. ACM, 2012.
  3. Kulesza, Todd, Simone Stumpf, Margaret Burnett, and Irwin Kwan. “Tell Me More?: The Effects of Mental Model Soundness on Personalizing an Intelligent Agent.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1–10. ACM, 2012.
  4. Kulesza, Todd, Simone Stumpf, Margaret Burnett, Sherry Yang, Irwin Kwan, and Weng-Keen Wong. “Too Much, Too Little, or Just Right? Ways Explanations Impact End Users’ Mental Models.” In 2013 IEEE Symposium on Visual Languages and Human Centric Computing, 3–10. IEEE, 2013.
  5. Valdez, André Calero, Simon Bruns, Christoph Greven, Ulrik Schroeder, and Martina Ziefle. “What Do My Colleagues Know? Dealing with Cognitive Complexity in Organizations Through Visualizations.” In International Conference on Learning and Collaboration Technologies, 449–459. Springer, 2015.
  6. Zanker, Markus. “The Influence of Knowledgeable Explanations on Users’ Perception of a Recommender System.” In Proceedings of the Sixth ACM Conference on Recommender Systems, 269–272. ACM, 2012.

Survey

  1. Al-Taie, Mohammed Z. “Explanations in Recommender Systems: Overview and Research Approaches.” In Proceedings of the 14th International Arab Conference on Information Technology, Khartoum, Sudan, ACIT, Vol. 13, 2013.
  2. Buder, Jürgen, and Christina Schwind. “Learning with Personalized Recommender Systems: A Psychological View.” Computers in Human Behavior 28, no. 1 (2012): 207–216.
  3. Cleger, Sergio, Juan M Fernández-Luna, and Juan F Huete. “Learning from Explanations in Recommender Systems.” Information Sciences 287 (2014): 90–108.
  4. Gedikli, Fatih, Dietmar Jannach, and Mouzhi Ge. “How Should I Explain? A Comparison of Different Explanation Types for Recommender Systems.” International Journal of Human-Computer Studies 72, no. 4 (2014): 367–382.
  5. Papadimitriou, Alexis, Panagiotis Symeonidis, and Yannis Manolopoulos. “A Generalized Taxonomy of Explanations Styles for Traditional and Social Recommender Systems.” Data Mining and Knowledge Discovery 24, no. 3 (2012): 555–583.
  6. Scheel, Christian, Angel Castellanos, Thebin Lee, and Ernesto William De Luca. “The Reason Why: A Survey of Explanations for Recommender Systems.” In International Workshop on Adaptive Multimedia Retrieval, 67–84. Springer, 2012.
  7. Tintarev, Nava, and Judith Masthoff. “A Survey of Explanations in Recommender Systems.” In Data Engineering Workshop, 2007 IEEE 23rd International Conference on, 801–810. IEEE, 2007.

Transparency

  1. El-Arini, Khalid, Ulrich Paquet, Ralf Herbrich, Jurgen Van Gael, and Blaise Agüera y Arcas. “Transparent User Models for Personalization.” In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 678–686. ACM, 2012.
  2. Hebrado, Januel L, Hong Joo Lee, and Jaewon Choi. “Influences of Transparency and Feedback on Customer Intention to Reuse Online Recommender Systems.” Journal of Society for E-Business Studies 18, no. 2 (2013).
  3. Kizilcec, René F. “How Much Information?: Effects of Transparency on Trust in an Algorithmic Interface.” In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2390–2395. ACM, 2016.
  4. Radmacher, Mike. “Design Criteria for Transparent Mobile Event Recommendations.” AMCIS 2008 Proceedings, 2008, 304.
  5. Sinha, Rashmi, and Kirsten Swearingen. “The Role of Transparency in Recommender Systems.” In CHI’02 Extended Abstracts on Human Factors in Computing Systems, 830–831. ACM, 2002.

Trust

  1. Biran, Or, and Kathleen McKeown. “Generating Justifications of Machine Learning Predictions.” In 1st International Workshop on Data-to-Text Generation, Edinburgh, 2015.
  2. Cleger-Tamayo, Sergio, Juan M Fernández-Luna, Juan F Huete, and Nava Tintarev. “Being Confident about the Quality of the Predictions in Recommender Systems.” In European Conference on Information Retrieval, 411–422. Springer, 2013.
  3. Kang, Byungkyu, Tobias Höllerer, and John O’Donovan. “Believe It or Not? Analyzing Information Credibility in Microblogs.” In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015, 611–616. ACM, 2015.
  4. Katarya, Rahul, Ivy Jain, and Hitesh Hasija. “An Interactive Interface for Instilling Trust and Providing Diverse Recommendations.” In Computer and Communication Technology (ICCCT), 2014 International Conference on, 17–22. IEEE, 2014.
  5. Muhammad, Khalil, Aonghus Lawlor, and Barry Smyth. “On the Use of Opinionated Explanations to Rank and Justify Recommendations.” In The Twenty-Ninth International Flairs Conference, 2016.
  6. O’Donovan, John, and Barry Smyth. “Trust in Recommender Systems.” In Proceedings of the 10th International Conference on Intelligent User Interfaces, 167–174. ACM, 2005.
  7. Shani, Guy, Lior Rokach, Bracha Shapira, Sarit Hadash, and Moran Tangi. “Investigating Confidence Displays for Top-N Recommendations.” Journal of the American Society for Information Science and Technology 64, no. 12 (2013): 2548–2563.