by Nora Al Haider
(Originally published on Legal Design & Innovation medium publication)
This piece documents an intervention aimed at improving access to justice outreach. We used a technical intervention — standardized data markup on legal help websites — to improve people’s discovery of key, public-interest legal information.
Intended Goal: Our goal was to help more people learn about key legal rights & services when they searched online for help. Search engines’ results pages are a key place where people find out about legal help. We wanted to raise the placement of local, public interest legal help sources on Google search results pages. Ideally, a person searching with a problem like ‘eviction help’ or ‘how to get a restraining order’ will see local legal aid groups that can help them resolve their problem, or legal guides from their jurisdiction that can build their legal capability.
Our Intervention: We used website structured data markup, called Schema.org, to improve how search engines found & displayed legal aid websites to people searching online.
Our Team: Our team at Stanford Legal Design Lab partnered with local legal aid groups across the US, with support from the Pew Charitable Trusts & the Legal Services Corporation.
Schema Markup Intervention
In the early months of 2020, our team worked with stakeholders in Florida, Hawaii, New York, Alaska, and Idaho to implement an intervention that could improve the placement of legal aid and legal help websites in search results. We created schema markup for our stakeholder organizations’ websites, with the expectation that this structured data could help their public interest organizations show up higher on search result pages, and could match them with more appropriate users of search.
The non-profit Schema.org consortium of major web search providers, including Google, Bing, Yahoo, and Yandex, sets the standards for the markup, and its members also use it in their search engines. Schema markup is a way that websites can communicate with search engine crawlers and bots. It is JSON code (commonly in the JSON-LD format) that is put on the back end of an individual website page, and it indicates information about the organization that runs the page as well as the services and content that are on the page itself. It uses a standard markup vocabulary, which all of the major search engine providers agree on, to communicate these key pieces of information to the search engines automatically.
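To make this concrete, here is a minimal Python sketch of what organization-level markup can look like when serialized as JSON-LD. The organization name, URL, phone number, and issue areas are hypothetical, and the choice of properties is an assumption for illustration; it is not the protocol described in this piece.

```python
import json

# Hypothetical organization details, for illustration only.
org_markup = {
    "@context": "https://schema.org",
    "@type": "LegalService",
    "name": "Example Legal Aid",
    "url": "https://www.example.org/",
    "telephone": "+1-555-0100",
    "areaServed": {"@type": "State", "name": "Idaho"},
    "priceRange": "Free",
    "knowsAbout": ["Eviction", "Domestic violence", "Unemployment benefits"],
}


def to_script_tag(markup: dict) -> str:
    """Wrap the markup in the script tag that crawlers look for."""
    body = json.dumps(markup, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_script_tag(org_markup))
```

The printed tag is what would be placed on the page. Site visitors do not see it, but crawlers can read it.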
Markup is often used by commercial businesses in their search engine optimization (SEO) strategies. In addition to other SEO techniques, like having more websites linked to your website, creating content that matches with common search terms, and having fast loading and mobile responsive pages, schema markup and other structured data techniques can help an organization get more prominence in people’s search results.
One of the main issues with Schema markup is the high barrier to knowing how to use it for a particular type of organization. The markup itself is a collection of terms that can be defined however any given webmaster or content developer wishes to. It is free to use, but takes a great deal of expertise to understand how to use it correctly for a certain type of organization, website, and set of services.
Steering group of stakeholders, on legal help markup protocols
For this reason, our stakeholder group focused on developing a standard protocol for legal help organizations who wanted to mark up their sites with schema markup. Over the course of six months, our lab team drafted and revised a standard way to use the markup for legal aid organizations, statewide legal help portals, and other legal information websites. We did this with constant feedback from these organizations’ webmasters, lawyers, and content developers. We held twice-monthly phone calls and Slack office hours, in which we presented our proposed markup protocol for these stakeholders’ revisions and approval. Their input helped us to understand:
- What types of organizational details, services, and site content they wanted to make sure that people searching online should find, and what information they would rather not be shown publicly (or shared widely).
- What the right ‘unit’ for sections of markup should be: like describing all services around the main organization, or describing the services that each office of the organization provides, or describing the services that each unit of the organization provides. We ended up with a ‘graph’ model of the legal schema: describing the main organization, then describing each service it provides, and linking them together using key ids that made it clear that the organization provided the particular services.
- How to actually implement the schema on their websites and content management systems, to get the JSON code onto their homepages without disrupting their current codebase and SEO strategies.
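The ‘graph’ model described in the list above can be sketched in Python as follows. This is only an illustration under stated assumptions: the identifiers, organization name, and service names are hypothetical, and the use of the Schema.org "provider" property to link each service back to the organization is one plausible way to express the link, not necessarily the one the stakeholder group settled on.

```python
import json

# Hypothetical identifier for the organization node.
ORG_ID = "https://www.example.org/#organization"


def build_graph(org_name: str, service_names: list[str]) -> dict:
    """Describe one organization and its services as linked nodes."""
    org = {"@type": "LegalService", "@id": ORG_ID, "name": org_name}
    services = [
        {
            "@type": "Service",
            "@id": f"https://www.example.org/#service-{i}",
            "name": name,
            # The @id reference ties the service to the organization node.
            "provider": {"@id": ORG_ID},
        }
        for i, name in enumerate(service_names, start=1)
    ]
    return {"@context": "https://schema.org", "@graph": [org, *services]}


graph = build_graph("Example Legal Aid", ["Eviction defense", "Family law hotline"])
print(json.dumps(graph, indent=2))
```

Because each service points at the organization by id, new services can be added as extra nodes without rewriting the organization's own description.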
Our markup covered the pieces of information about the organization, as well as its general legal aid or legal help services. In our first round of markup, we decided not to mark up events, one time services, or other pieces of information that were likely to change quickly. Our stakeholders warned us that they often don’t take down stale information on clinics or events, so we may inadvertently display events to users that no longer take place (like bi-weekly office hours on eviction help, that may have been halted several months ago). They explained that their events are often funding-contingent, so when a 12- or 18-month grant ends, the event service ends but they do not necessarily update the website with this information.
Instead of events or other information that changes frequently, we focused on the stable information about the website, like the organizational details, contact information, jurisdiction, areas of issue expertise, price range, and type of organization. This organization-level markup may have an effect on these organizations’ sites being displayed more prominently, potentially with a map, business hours, description, and contact details shown on the results page. It was not page-level markup, which would be more likely to make specific sections of content, FAQs, or step-by-step how-to’s appear on the results page. This first round of schema markups is available for public review.
By Spring 2020, we had an agreed-upon set of terms and ways of laying out schema markup for legal help sites. Our Lab handcrafted markup for our partner organizations, following the protocol. We gave the seven legal help organizations the text files of the markup for them to apply on the back end of their main home page. Because each group had a different content management system, there was wide variation across our cohort in how easy or hard it was to implement the markup on their website. Typically, markup is implemented by manually inserting it into the backend code of a website, such as the homepage’s header code. In some cases, legal aid groups use other search engine-related apps on their content management systems that interfere with new markup being added. We had intended that all sites would implement it at the same time so that we could have regular check-ins to see any effects, and to watch for the effects of possible external variables like new COVID-related policies on unemployment or rental housing. But there were gaps of several weeks or months between the different sites’ schema implementations.
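As a rough illustration of the manual insertion step, the Python sketch below places a markup script tag just before the closing head tag of a page. The function name and sample page are hypothetical, and a real content management system would usually offer its own setting or plugin for header code, so this is an assumption about the simplest possible case.

```python
def insert_markup(html: str, script_tag: str) -> str:
    """Insert a markup script tag just before the page's closing head tag."""
    marker = "</head>"
    idx = html.lower().find(marker)
    if idx == -1:
        raise ValueError("page has no closing head tag")
    return html[:idx] + script_tag + "\n" + html[idx:]


# Hypothetical homepage and (empty) markup, for illustration only.
page = "<html><head><title>Home</title></head><body></body></html>"
tag = '<script type="application/ld+json">{}</script>'
print(insert_markup(page, tag))
```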
Expected impact of the markup intervention
Our expectation was that the schema intervention would increase the traffic of jurisdiction-correct visitors to the website. There have not been controlled studies of precisely how markup may increase overall traffic, or jurisdiction-correct traffic. It was unclear whether overall traffic volume would go up. Instead, the goal was to increase the jurisdiction-specific traffic of people from the state or the region that the organization serves. That can sometimes be seen through Google Analytics, which shows the state or region in which visitors are apparently located. (It is important to note, though, that people using various browsing tools may obscure their location or prevent tracking, for the sake of privacy.) We also expected to increase the number of visitors who were interested in the subject matter of the organization, as shown by the time they spent engaging with the site rather than ‘bouncing’ (or leaving quickly, apparently because the site did not match their intent). We wanted to find people whose search queries’ intent matched the services and information on the websites with the markup.
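A minimal sketch of the jurisdiction-correct share, assuming session counts have been exported by region from an analytics tool. The region names and numbers below are made up for illustration and are not results from this project.

```python
def jurisdiction_share(sessions_by_region: dict[str, int], served_region: str) -> float:
    """Fraction of sessions that appear to come from the region the site serves."""
    total = sum(sessions_by_region.values())
    if total == 0:
        return 0.0
    return sessions_by_region.get(served_region, 0) / total


# Hypothetical session counts before and after markup.
before = {"Idaho": 400, "Washington": 300, "(not set)": 300}
after = {"Idaho": 600, "Washington": 250, "(not set)": 150}

print(f"before: {jurisdiction_share(before, 'Idaho'):.0%}")
print(f"after:  {jurisdiction_share(after, 'Idaho'):.0%}")
```

In this made-up example total sessions stay flat while the in-state share rises, which is the pattern the intervention was hoping for.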
This first round of the schema intervention was also meant to be a learning experience in developing the intervention itself. As there is no official protocol on how to use the myriad schema markup terms to represent a given group, particularly a non-profit group (as most markup so far has been done by commercial groups, and examples are geared towards businesses), this experiment was meant to identify bugs, develop a common protocol, and begin to see how much of an effect on traffic structured data markup could have. As we created and implemented the markup, the goal was to learn 3 main things:
- How to automate the creation of markup so that more organizations could do it quickly if not automatically.
- How best to do more detailed markup, that would even better specify key terms about service, issue expertise, and jurisdiction, to improve how search engines showed information about these sites. In the first round, we kept a fairly short list of general, site-wide information to mark up, but in future rounds we hope to mark up more of the websites’ individual pages, guides, hotline, clinics, and other services they offer.
- Whether the hype around schema markup’s ability to improve sites’ rankings translated into real impact. Could markup increase how often sites appeared in the top 10 or top 3 search results? Could it increase the number of jurisdiction-correct visitors, who would spend time on the site to make use of its resources? If it did not have substantial impact, this would change our strategies, possibly toward more engagement with policy leaders at technology companies about how to improve public interest organizations’ placement, aside from structured data.
Once the seven organizations applied schema markup to their websites, we then used a combination of individual site analytics and post-markup search engine audit to analyze these three areas of inquiry. The search engine audit helps us to see what people are being shown by search engines, when they search for common legal problem terms. The site analytics help us to see who is finding their way to public interest legal help sites, what queries or referrals brought them to the sites, and how they behave when they are on the sites.
Website Analytics & Search Audit: What was different, what did the intervention show?
After implementing the Schema markup intervention on the seven websites, our team led an evaluation of what the sites’ Google Analytics showed in terms of impact before and after the markup. This is in addition to a post-markup audit of search engine results, to see if the public interest sites appeared more frequently or in higher positions after they had markup on their site.
Whereas the search engine audit informs us about what sites people are shown as likely relevant to their question, website analytics inform us about how many people end up coming to a website to find help for their problems. We use both measures to see how markup affects how search engines display listings of sites, and how users ultimately behave in choosing sites to visit and spending time on them.
Google Analytics provides an overview of website traffic. It includes reports and analytics on traffic sources, locations, demographics and behavior of the audience. We used the Google Analytics data to analyze the effects of the Schema markup intervention, and the impact of COVID-19 on the aforementioned markup intervention.
How soon does markup make a difference?
Generally, the effects of schema markup cannot immediately be evaluated after they’ve been implemented on a website. There are varying online reports, and no clear answers, but according to most informal message board threads, the effects of markups are visible on search engines and analytics pages anywhere between 1 week and 1 month. It may take a while because of the cycles on which search engines’ crawler bots search the Internet to index new sites, content, and markup. They may be delayed in ‘finding’ the markup, if they only visit the site once a week or once a month to look for updates.
In our project, the schema markups for the legal aid and service organizations were all implemented on their websites on different dates. In order to provide a valid evaluation, we had to ensure that enough time had elapsed between the implementation of the markup and the analysis.
Schema Implementation Records
Organization & Schema markup implementation date
Idaho Legal Aid: 5/6/2020
Legal Services of North Florida: 5/11/2020
Law Help New York: 5/29/2020
Legal Aid Hawaii: 6/2/2020
The above info provides an overview of the implementation dates. The analysis was conducted on the websites’ analytics at the end of August 2020.
How are the sites performing?
Two main sources were mined for data: Google Analytics and Google Search Console. It is important to distinguish these two tools. Google Analytics provides more insight about the visitors, page visits, and usage time of the webpage. It tracks and reports different segments of website traffic. Google Search Console, on the other hand, provides more insight on the organic traffic search results. Organic search traffic indicates the visitors that visit the website through search engines. We used both of these data sources to analyze the impact of the schema markups.
Before delving into the results, we note some measurement issues and important assumptions that may have visibly affected them:
- Different implementation dates: The implementation date is different for each of the legal aid and service websites. Although we left the minimum required time (1 week) between each implementation and the start of the analysis, there are some unknown aspects regarding the detection of the markup by search engines. This might have created some variances in the analysis.
- Missing data: There was the issue of missing data for some of the legal aid websites. Florida Law Help switched ownership in early 2020. Google Analytics was disabled when it was moved to a different platform, therefore there is missing data in the months of April, May and June 2020.
- Correlation/Causation: Although some of the data suggests a possible correlation between the schema markup and several of the metrics we analyzed, this does not necessarily indicate causation.
Results: Did Schema improve visitors and quality matches?
We looked at four separate metrics to determine if and how schema markup changed the legal help websites’ discoverability: visitor count, traffic sources, session duration, and click-through rate. We recognize that particularly in a year with COVID-related upheavals, people may have changed their search behavior, and that this could confound our reliance on analytics. Ideally, in future years with fewer emergency events, there can be further study of website analytics before and after the implementation of markup or other SEO strategies.
Metric 1: Visitor Count
When thinking about impact, increasing the number of visitors to a website may seem to be the most important metric. Yet we know from our discussions with legal aid experts that they are most interested in increasing the number of ‘appropriate’ visitors — those from their service area and who have legal help queries — even if that means a decrease in overall visitors.
Visitor count can also be a problematic metric for measuring the effectiveness of a particular intervention, like schema markup. The visitor count metric will be affected by other variables aside from markup — like policy changes, economic downturns, natural disasters, and other events that may affect who is searching for legal issues. We can expect that a large swing in visitor count is likely due to an annual or special legal event. Theoretically, comparing the visitor count trend for 2020 with the trend for 2019 can help separate out the seasonal changes in visitor count from the effect of schema implementation, but because of the unique legal circumstances in 2020 (i.e. COVID lockdown, economic downturn, and related legal issues around housing, benefits, unemployment, family, schooling, etc.), such a direct comparison was not possible either.
Therefore we analyzed 3 other metrics, in addition to visitor count: traffic sources, session duration, and click-through rate.
Metric 2: Traffic Sources
We compared where sites were getting user traffic from. In particular, we compared direct traffic versus search traffic. Direct traffic means users are arriving at the website directly through a link or a bookmark, while search traffic means users are arriving via a Google search. If an increase in visitors was caused by schema, we would expect to see an increase in search traffic only, but if the increase was caused by current events, then both direct and search traffic would likely increase.
As the column chart for Law Help New York indicates, there is a spike in visitor count for search traffic in the month of July, after schema was implemented on 5/29/2020. This means that there was an increase in visitors who reached the website through organic sources, such as search engines. A possible explanation for this spike is that rich results and better ranking on search engine result pages, generated by schema, produced more clicks in the month of July.
Figure 1: Column chart depicting Law Help New York’s direct and search traffic per month for the year 2020
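The reasoning about direct versus search traffic can be written as a small decision rule. The Python sketch below is an assumption-laden illustration: the 10% threshold is arbitrary, the traffic counts are invented, and a real analysis would still have to account for the confounders discussed later in this piece.

```python
def pct_change(before: float, after: float) -> float:
    """Percent change from the pre-markup to the post-markup period."""
    return (after - before) / before * 100.0


def classify(direct_change: float, search_change: float, threshold: float = 10.0) -> str:
    """Label a traffic pattern; the threshold is an arbitrary illustrative cutoff."""
    if search_change >= threshold and direct_change < threshold:
        return "search-only increase (consistent with a markup effect)"
    if search_change >= threshold and direct_change >= threshold:
        return "both increased (consistent with current events)"
    return "no clear increase"


# Hypothetical monthly visitor counts: (before markup, after markup).
direct = (1000, 1020)
search = (2000, 3000)

print(classify(pct_change(*direct), pct_change(*search)))
```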
Metric 3: Session duration
The metric of ‘session duration’ is significant for our analysis, as an indicator of a good match between the site’s content and the user’s intent. The schema implementation aims to increase not just the volume of search traffic to legal help websites, but good matches to the person’s actual needs. If there is an increase in session duration, then this might suggest that the website is attracting more people who find value in the website’s information and are willing to spend more time engaging with it. The visitor session duration metric can also tell us if an increase in visitor count is due to the website being discovered by people who really need legal help from that state, or if it is due to the website being suggested to people who click on it but then realize it is irrelevant to their situation.
For Idaho Legal Aid, we saw a very drastic increase in page views per session for visitors who were on the website for 30+ minutes. This might suggest that the website is attracting more people to whom the website’s content is relevant.
At first glance, it seemed that Legal Aid Hawaii also enjoyed an increase in visitor count, but it turned out that the gain was due to an increase in visitors who stayed for 0–10 seconds. These visitors are more commonly known as bouncers. Law Help NY also saw an increase in bouncers, but there was also a slight increase in the number of visitors who stayed for between 10 seconds and 10 minutes.
Figure 2: The Idaho Legal Aid website saw an increase in pageviews per session after the schema markup implementation.
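To show how session durations can be grouped into the kinds of bands discussed above, here is a minimal Python sketch. The 0–10 second, 10 second to 10 minute, and 30+ minute bands come from the text; the 10–30 minute band and the sample durations are assumptions added for illustration.

```python
from bisect import bisect_right
from collections import Counter

# Band edges in seconds. A duration equal to an edge falls in the higher band.
BOUNDS = [10, 600, 1800]
LABELS = ["0-10 s", "10 s-10 min", "10-30 min", "30+ min"]


def bucket(seconds: float) -> str:
    """Return the duration band that a session of this length falls into."""
    return LABELS[bisect_right(BOUNDS, seconds)]


# Hypothetical session durations in seconds.
durations = [3, 8, 45, 700, 2000, 5]
counts = Counter(bucket(d) for d in durations)

for label in LABELS:
    print(label, counts[label])
```

A rise concentrated in the shortest band points to more bounces, while a rise in the longer bands suggests better-matched visitors.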
Metric 4: Click-through rate (CTR) and search ranking
The last metric that can be useful is the click-through rate and search ranking. Indeed, this might be the most appropriate metric to evaluate the markup’s effectiveness, since Google communications claim that schema markup will improve search ranking. Of course, whether that improved search ranking means that more people will click on the search result and interact with the website is a step removed from improved ranking. Florida Law Help had incorrectly implemented their Google Analytics, but they had provided more extensive Search Console data than the other websites. The click-through rate generally increased after schema implementation. Search rank was on the decline until the markup was implemented — then, the website rank steadily started to improve.
Figures 3 and 4: click-through rate and search ranking improvement for Florida Law Help after the schema markup implementation.
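For readers unfamiliar with the metric, click-through rate is the number of clicks divided by the number of impressions (the times a result was shown). The Python sketch below compares two periods using invented figures; the field names are assumptions modeled on the clicks, impressions, and average position that Search Console reports.

```python
def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate: the share of impressions that led to a click."""
    return clicks / impressions if impressions else 0.0


# Hypothetical totals for the periods before and after markup.
before = {"clicks": 120, "impressions": 4000, "position": 14.2}
after = {"clicks": 210, "impressions": 5000, "position": 11.5}

for label, period in (("before", before), ("after", after)):
    rate = ctr(period["clicks"], period["impressions"])
    print(f"{label}: CTR {rate:.1%}, average position {period['position']}")

# A lower average position number means the site ranks closer to the top.
rank_improved = after["position"] < before["position"]
print("rank improved:", rank_improved)
```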
Searchers for the organization versus the issue
Two good indicators for whether schema would be effective for a given site are: 1) the proportion of search traffic to direct traffic and 2) the proportion of clicks after the user searches the exact name of the organization. We chose these indicators because markup would only affect a person’s online behavior if they were searching help through a search engine instead of going to the website directly. Also, markup-driven increased search placement or rich snippets are more likely to affect a user’s chances of clicking on a website if they were looking for general legal advice. Users who were already looking for a specific organization’s website will look for and click on the organization’s website regardless of whether there’s a rich snippet or higher placement.
About 50% of Legal Services of North Florida’s (LSNF) traffic was direct, which means that roughly half of the visitors to their website never got to see the rich snippets at all (since schema implementation adds rich snippets to search results). For LSNF and Legal Aid Hawaii, 34% and 18% of visitors, respectively, had typed in the exact name of the organization in the search bar. This suggests that a good portion of visitors to these sites were already looking for the organization’s website, so a rich snippet or other special search engine treatment would not increase their chances of clicking on their results.
Comparatively, only 5% of Idaho’s visitors had typed in the exact name of the organization. Since most of Idaho’s visitors are people searching more generic help terms (like “tenant eviction help idaho,” instead of “idaho legal aid”), the rich snippets can really help draw attention to the site. A legal aid group that wants to increase its search traffic could use schema markup to connect with people searching for their problem issues, rather than for a local legal aid group.
Figure 5: LawHelp Minnesota’s traffic source is mostly search traffic. This signals potential for schema markup implementation.
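The second indicator, the share of clicks that follow a search for the organization's exact name, can be estimated from a table of queries and clicks. The Python sketch below is a simplified illustration: the organization name, queries, and click counts are hypothetical, and exact matching after lowercasing and whitespace cleanup is an assumption about what counts as typing the exact name.

```python
def normalize(text: str) -> str:
    """Lowercase the text and collapse repeated whitespace."""
    return " ".join(text.lower().split())


def branded_click_share(query_clicks: dict[str, int], org_name: str) -> float:
    """Share of search clicks whose query was exactly the organization's name."""
    total = sum(query_clicks.values())
    if total == 0:
        return 0.0
    target = normalize(org_name)
    branded = sum(c for q, c in query_clicks.items() if normalize(q) == target)
    return branded / total


# Hypothetical queries and click counts.
query_clicks = {
    "Example Legal Aid": 50,
    "tenant eviction help": 600,
    "how to get a restraining order": 350,
}

print(f"{branded_click_share(query_clicks, 'example legal aid'):.0%}")
```

A low share suggests most searchers arrive through problem-oriented queries, which is where rich snippets have more room to influence clicks.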
Confounders of pre- and post-markup search analytics
We recognize that other factors aside from the application of schema markup may have affected the websites’ analytics.
One set of confounding variables is content and technical changes on the websites we were working with. For example, session duration could also have increased due to technical changes on a website. A more complex layout, for example, can make a website harder to navigate, which could increase session duration without reflecting a better match. After a check-in with the public interest groups, we found that none of the layouts had changed since the implementation of the markup, and we could therefore proceed with using the metric in our analysis.
A second set of confounders concerns seasonal changes. People’s legal help situations may differ throughout a given year, rising and dipping with different seasons and regular events. Financial legal needs may spike around tax season in the early months of the year, or during the holidays when more money is being spent. Educational problems may rise before and during the first months of a school year. Search traffic may be higher at certain times of the year, and thus there might be changes in search traffic during the analysis period that are based more on changing legal needs than on the markup intervention.
A more difficult set of confounding variables emerged with COVID-19. The health and economic emergencies led to both new legal protections, and more people facing legal help situations like needing to file for unemployment benefits, avoid eviction for nonpayment of rent, deal with domestic violence threats, and deal with contracts affected by pandemic restrictions. These new protections and problems likely changed people’s search behavior. We might assume that the pandemic increased the levels of searches for legal help, and also introduced new types of searches (like, ‘is there an eviction moratorium in my state’ or ‘how do i claim my stimulus check’). This changing search behavior presented a major confounding variable for our intended pre-/post-study of the markup’s effects on legal help search.
At first, we thought that we would wait out the emergency period, and run the intervention once it ended. Once we realized that the emergency would be continuing indefinitely, we considered how to incorporate the data fluctuations into our analysis. We adjusted our evaluation, to look not only at pre- and post-markup metrics (both in the year 2020), but also comparing the same time period to the previous year. This method allowed us to spot fluctuations that were likely caused through changed search behavior relating to the pandemic, as opposed to seasonal confounding variables that would occur each year. Moreover, as we noted above, we decided to use multiple website analytics metrics in addition to the visitor count, which would be more susceptible to fluctuations due to current events, such as Covid-19.
We tried to use these techniques to minimize the effects of Covid-19 on our data analysis. But there is no guarantee that we managed to fully eliminate the impact that Covid-19 had on the data analysis, and we recommend that future studies in a more ‘stable’ year will be needed to measure markup’s impact.
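One simple way to combine a pre/post comparison with a same-period comparison to the previous year is to subtract last year's growth over the same months from this year's growth. The Python sketch below shows that arithmetic with invented visitor counts. It is an illustrative assumption about how such a comparison could be set up, not the exact calculation used in this project, and it cannot separate a markup effect from a one-off event such as the pandemic.

```python
def growth(pre: float, post: float) -> float:
    """Ratio of post-period traffic to pre-period traffic."""
    return post / pre


def excess_growth(pre_2020: float, post_2020: float,
                  pre_2019: float, post_2019: float) -> float:
    """Growth in 2020 beyond what the same calendar periods showed in 2019."""
    return growth(pre_2020, post_2020) - growth(pre_2019, post_2019)


# Hypothetical visitor counts for matching pre- and post-markup months.
extra = excess_growth(pre_2020=1000, post_2020=1400, pre_2019=1000, post_2019=1100)
print(f"growth beyond the seasonal pattern: {extra:.0%}")
```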
Search Engine Results Page Audit, round 2
After implementing the schema markups on our partners’ sites, we conducted a second round of a search engine results page (SERP) audit. This second audit was to determine whether, after the markup implementation, we saw any differences in public interest legal aid sites’ placements. We compared the first SERP audit’s results with the second’s, to see if there was any change overall in domain types, and in specific websites’ performance.
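A domain-type breakdown of audit results can be computed along the following lines. This Python sketch is a simplified assumption about the tallying step: it takes a list of result URLs and reports the percentage for each domain suffix. The two .org URLs are partner sites named in this piece, while the .gov and .com URLs are placeholders.

```python
from collections import Counter
from urllib.parse import urlparse


def suffix(url: str) -> str:
    """Return the last label of the hostname, such as '.org' or '.gov'."""
    host = urlparse(url).hostname or ""
    return "." + host.rsplit(".", 1)[-1] if "." in host else "(none)"


def suffix_breakdown(urls: list[str]) -> dict[str, float]:
    """Percentage of results for each domain suffix."""
    counts = Counter(suffix(u) for u in urls)
    total = sum(counts.values())
    return {s: 100.0 * n / total for s, n in counts.items()}


# Illustrative list of search result URLs.
results = [
    "https://www.lsnf.org/",
    "https://floridalawhelp.org/",
    "https://www.example.gov/help",
    "https://www.example.com/",
]

print(suffix_breakdown(results))
```

Running the same tally on both audit rounds gives two breakdowns that can be compared suffix by suffix.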
One change we observed was a decrease in the number of .org domains in Northern Florida-based searches.
Figure 6: Search Audit, Round 2: Florida Domain Suffix percent breakdown
We had added markup to two legal help websites with .org domains that try to serve users with legal help queries: https://www.lsnf.org/ ; https://floridalawhelp.org/. Our assumed outcome was that these .org sites would appear more frequently because of the markup intervention, but this did not occur. Interestingly, there was an increase in .gov links that appeared in the results. This was most likely not due to our schema intervention, though. Our markup was only added to organizations’ sites at .org domains, and not to sites at .gov domains. The increase in .gov links could be explained by the Covid-19 information and help pages set up by governmental institutions. Their improved website offerings may have increased their search placement.
Figure 7: Search Audit, Round 2: Hawaii Domain Suffix percent breakdown
Hawaii’s second SERP audit illustrates a similar trend. There is a slight increase in .gov domain extensions, but as noted above, the stakeholders that participated in this research project did not have a .gov domain extension. The slight increase in both .gov and .com domains can potentially be explained by the Covid-19 pandemic that created a surge in both searches as well as the creation of help and information pages regarding the pandemic.
Figure 8: Percentage of non-US domain names
There also hasn’t been a significant change in the percentage of non-US domain names. This percentage seems to be nearly the same in November as in January of this year.
The lack of development and changes in the search audits pre- and post-schema markups may be explained by two factors. The first factor is the limited duration of the schema implementation on the websites. As noted earlier in this chapter, most stakeholders implemented the markup in the early summer months. There is no clear explanation from search companies or research data on how long it takes before the effects of the markup can be seen on the result pages of search engines. It might take several more months before we can see the markup being recognized and understood by the search engine crawlers and subsequent ranking algorithms. The markup merely tells crawlers what is on the page, but there is no documented, predictable process for how search engine teams, crawlers, and algorithms take the markup information and use it to affect the search results. There is also the possibility of advocacy to technology companies, to have their search teams pay particular attention to this markup so that they track and evaluate internally how well they are serving legal help searchers.
The second factor is more technical. The schema markups have mainly been implemented on the main page of the organizations’ websites. However, it might be possible that the benefits of schema are mostly gained when the markup is implemented on the specific pages of the organization’s website. For example, describing on the main page that Legal Services of North Florida is a legal aid organization that serves the Pensacola and Jacksonville jurisdictions may not be sufficient. Our first round of markup interventions, focused on organization markup, may not be nearly as effective as issue- and service-oriented markup. Future interventions and analysis should concentrate on adding markup to organizations’ specific hotlines, guides, and other content on specific issues, such as unemployment, veterans, landlord-tenant, or domestic violence. The main page often embeds general information about the organization, but that’s not necessarily what a user needs. Users are often looking for information on a specific problem and are thus more likely to click on a link on a search engine results page that provides a snippet with specific information. They would be searching for their problem, and not for ‘legal aid group near me’. This means that in future iterations of this project, more attention has to be given to schema markups for specific legal information pages.
Did the Schema.org intervention change the legal help sites’ performance?
Because of the confounders, particularly around COVID-19’s effects on search behavior, we cannot draw strong conclusions about schema markup’s effects on traffic to public interest sites. We can use the pre- and post-markup analytics to observe some changes that indicate that some sites experienced higher numbers of visitors and improved click-through rates. But not all sites experienced these increases. This 2020 analysis suggests that there might be value in schema markup to increase search rank, but marking up at the organizational level does not produce a consistent ‘markup bump’.
Speaking with our legal aid practitioners and search engine contacts about these results, some alternate proposals emerged. In future interventions and evaluations, there might be a higher increase in traffic if more of the sites’ individual webpages and services are marked up (not just the general organization). In this way, their specific guides, articles, FAQs, hotlines, and clinics on issue areas like renting, debt, domestic violence, etc. may be communicated to search engine bots. These bots can recognize that these individual pages have content that can help people searching for problems in this area, and then do a better job in matching searchers to this specific page (not the homepage) that can help them with their query.
Although the results indicate that there might be value for some sites in using markup to increase traffic, it is important to look at the schema markups in more relative terms. Is this intervention, of developing and applying structured data on a website, meaningful enough for wide application in the public interest legal help community? How much does the schema markup improve a site’s search engine results and increase its number of targeted visitors?
The exact impact of markup is difficult to pin down. Google does not state conclusively how its algorithm uses schema markup in its search results, or how this differs for particular areas of questions (like legal help). Even though Google encourages sites to use markup to improve their search placement, there are many other search engine optimization factors in play, such as mobile-responsiveness of the websites, user-friendly content, search experience of individual users, page-loading times, preventing spam on webpages, etc. Even if a legal aid site has markup, other sites that are using other SEO techniques may still place above it. Implementing schema may increase the likelihood that a site’s content appears in rich Google search results, or that results are better targeted to users’ queries, but it does not guarantee either.
In the future, Google’s search engine may treat the sites differently if the markup specifies that they are authoritative — with a government designation, public interest credentials, or another standard that is set as authoritative. This would depend on the legal community’s ability to decide what a marker of authority could be, and then discussions with search companies to make them aware of these markers and why they should be considered in their algorithm. For example, a coalition of courts and legal aid groups could identify what makes their sites more authoritative and beneficial to people seeking legal help. They could agree on how to use schema markup terms to communicate their public interest status. Then they could convey this work to the search companies, and attempt to inform them about how they are using markup to designate their authority and why the search algorithm should pay regard to this part of the markup.
Image: An annotated version of our schema markup, that was revised after our initial implementation, to better allow for building onto it with individual pages + articles marked up as well as the main organization and legal aid services.
When we are discussing future work and next steps, it is important to realize that the schema markups are just one part of a larger set of SEO techniques and search engine policy decisions. Markup will not be the sole solution to improve search results, create rich snippets and other special treatment on the results page, and drive more traffic to public interest legal help sites. Equal attention has to be given to creating user-friendly content, easily accessible web layouts, fast loading webpages, compatibility with various devices and browsers, etc. Last but not least, another important factor is setting up partnerships with tech companies. These companies can provide public interest groups with the correct information on how to adjust their websites and markups, and such partnerships open up more conversation on why there is a need for legal help snippets and better search engine page results. It is only through a combined effort by both parties that we can increase the accessibility of online legal information.
A big thank you goes out to our research assistants, Yue Li and Julia Park, and all of the wonderful pioneering stakeholders who participated in this research project: thank you!