4 Working with JSON Files for Network Analysis

In-Class Exercise for Week 5

Author

KB

Published

May 13, 2023

(First Published: May 13, 2023)

4.1 Getting Started

4.1.1 Install and load the required R packages

  • jsonlite: reads and imports JSON files.

pacman::p_load(jsonlite,tidygraph,ggraph,visNetwork,tidyverse)

4.1.2 Import the data

Import the given MC1.json file into R and assign the data to MC1.

MC1 <- fromJSON("data/MC1.json")
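To see what fromJSON() returns for a node-link file, here is a minimal sketch that parses a small made-up JSON string. The ids, types and weights are hypothetical and are not taken from MC1.json. By default, jsonlite simplifies an array of objects into a data frame, which is why the nodes and links elements can be passed straight to as_tibble().

```r
library(jsonlite)

# Hypothetical node-link JSON, for illustration only
toy_json <- '{
  "nodes": [
    {"id": "A", "type": "company", "country": "X"},
    {"id": "B", "type": "person"}
  ],
  "links": [
    {"source": "A", "target": "B", "type": "ownership", "weight": 0.9, "key": 0}
  ]
}'

toy <- fromJSON(toy_json)

class(toy)         # the top level is a list
class(toy$nodes)   # each array of objects becomes a data frame
toy$nodes$country  # a field missing from a record is filled with NA
```

The same pattern applies to the real file: the top-level object is a list, and the node and edge tables are elements of that list.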

Extract the node information from the MC1 list.

MC1_nodes <- as_tibble(MC1$nodes) %>%
  select(id, type, country)

Extract the edge information from the MC1 list.

MC1_edges <- as_tibble(MC1$links) %>%
  select(source, target, type, weight, key)

Aggregate the weights between each pair of nodes, by relationship type.

MC1_edges_aggregated <- MC1_edges %>%
  group_by(source, target, type) %>%
  # total the weights within each source-target-type group
  summarise(weight_sum = sum(weight)) %>%
  # drop self-loops
  filter(source != target) %>%
  ungroup()
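The aggregation step can be checked on a small hypothetical edge list (the values below are invented for illustration). This sketch drops self-loops, collapses parallel edges of the same type into a single row, and sums their weights.

```r
library(dplyr)

# Hypothetical edges: two parallel "ownership" edges A -> B,
# one "membership" edge A -> B, and one self-loop B -> B
toy_edges <- tibble(
  source = c("A", "A", "A", "B"),
  target = c("B", "B", "B", "B"),
  type   = c("ownership", "ownership", "membership", "ownership"),
  weight = c(0.5, 0.25, 1, 1)
)

toy_aggregated <- toy_edges %>%
  filter(source != target) %>%                            # drop the self-loop
  group_by(source, target, type) %>%
  summarise(weight_sum = sum(weight), .groups = "drop")   # one row per group

toy_aggregated
```

Note that calling sum() with no argument returns 0 for every group, so the column to total has to be named explicitly.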

4.1.3 Use tbl_graph() to build a tidygraph data model

We use tbl_graph() from the tidygraph package to build a tidygraph network graph object.

MC1_graph <- tbl_graph(nodes = MC1_nodes,
                       edges = MC1_edges_aggregated,
                       directed = TRUE)
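As a minimal sketch of how tbl_graph() links the two tables, the example below builds a graph from hypothetical node and edge data frames (all values are invented). It names the endpoint columns from and to, and passes node_key = "id" so that the character endpoints are matched against the id column explicitly.

```r
library(tidygraph)

# Hypothetical node table
toy_nodes <- data.frame(
  id   = c("A", "B", "C"),
  type = c("company", "person", "company"),
  stringsAsFactors = FALSE
)

# Hypothetical edge table whose endpoints refer to node ids
toy_edges <- data.frame(
  from       = c("A", "B"),
  to         = c("B", "C"),
  weight_sum = c(0.75, 1),
  stringsAsFactors = FALSE
)

# node_key names the node column that the character endpoints refer to
toy_graph <- tbl_graph(nodes = toy_nodes,
                       edges = toy_edges,
                       directed = TRUE,
                       node_key = "id")

toy_graph
```

tbl_graph() stores the endpoints as integer row positions in the node table, which is why a printed edge table shows from and to as integers rather than ids.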

Let's take a look at the resulting tidygraph graph object.

MC1_graph
# A tbl_graph: 3428 nodes and 10747 edges
#
# A bipartite multigraph with 93 components
#
# A tibble: 3,428 × 3
  id                       type         country 
  <chr>                    <chr>        <chr>   
1 Spanish Shrimp  Carriers company      Nalakond
2 12744                    organization <NA>    
3 143129355                organization <NA>    
4 7775                     organization <NA>    
5 1017141                  organization <NA>    
6 2591586                  organization <NA>    
# ℹ 3,422 more rows
#
# A tibble: 10,747 × 4
   from    to type                weight_sum
  <int> <int> <chr>                    <int>
1    49    51 family_relationship          0
2    49    52 family_relationship          0
3    49     4 family_relationship          0
# ℹ 10,744 more rows

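Further data cleaning is required before we can plot the graph. The output above already shows two things to check: missing country values (<NA>) in the node table and a weight_sum of 0 in the displayed edge rows.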