It occurred to me I’m not always putting my R powers to good use. So why not make a wrapper to a data API everyone can use? I’ve decided to use my relatively deep knowledge of ebay, and wrap its search API (a.k.a the Finding API) using the httr package, and here we are :)

search_ebay("elvis costume")
## # A tibble: 10 x 17
##          itemId                                                                            title categoryId
##           <chr>                                                                            <chr>      <dbl>
##  1 361600561711                                               Elvis Presley / Adult Male Costume      52762
##  2 222757368483                                                          ELVIS STYLE   TOUR BELT      52762
##  3 263200865065                          Adults Licensed Deluxe Gold Satin Elvis Presley Costume      52762
##  4 122657226058        Rubies Adult Elvis Presley Impersonator Costume Cape-One Size-Made in USA      52762
##  5 132438178557                     Elvis jumpsuit belt (professional) goes with most anything!       52762
##  6 282766777028                                            ELVIS STYLE MATT BLACK PUFFY SHIRT XL      52762
##  7 282269304226                                                 Elvis Child's Costume, Large New      80913
##  8 282766792358                                     ELVIS STYLE 50's SHIRT BLACK & WHITE - LARGE      52762
##  9 352211151488           Elvis Eagle White Cape Licensed King Of Rock Vegas Halloween Accessory      52762
## 10 222677654896 Elvis Costume 70's Style 3 Pc Wht Jumpsuit W/ Gld Sequin Trim Belt & Satin Scarf      52762
## # ... with 14 more variables: categoryName <chr>, viewItemURL <chr>, location <chr>, sheepingType <chr>,
## #   shipToLocations <chr>, isMultiVariationListing <lgl>, conditionId <dbl>, conditionName <chr>,
## #   listingType <chr>, startTime <date>, endTime <date>, watchCount <dbl>, price <dbl>, currency <chr>

You can get the package here. The first part of this post will expand on its README, detailing some more stuff you should know (though I can’t get into every single detail, the ebay ecosystem is massive). In the second part I’ll give you some ideas of what you can do with this package.

Search It For A Spin

Install it from my Github repo:

devtools::install_github("gsimchoni/ebayr")

Load it:

library(ebayr)
## Welcome to the R wrapper to the ebay Finding API.
## I see you don't have a token for this API (environment variable 'EBAY_TOK').
## Please get one at https://go.developer.ebay.com/ and set it using setEbayToken(YOUR_TOKEN).

You should see a message saying you have not set your ebay token yet. You need to register to the ebay Developers Program here and get a token for the Finding API, for Production (not Sandbox). It takes 2 minutes. Save the token somewhere for later use and set it so that R can recognize it (I’m setting an environment variable here so you don’t have to pass it to the function each time, though that is also possible):

setEbayToken("YOUR_EBAY_TOKEN")
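
If you are curious what keeping a token in an environment variable looks like, here is a minimal base R sketch. I am assuming setEbayToken() does something along these lines with the 'EBAY_TOK' variable mentioned in the startup message; the two helper names below are made up for illustration and are not part of ebayr.

```r
# Hypothetical helpers, NOT part of ebayr: store and read a token
# through the 'EBAY_TOK' environment variable, using base R only.
set_token_sketch <- function(token) {
  stopifnot(is.character(token), length(token) == 1, nzchar(token))
  Sys.setenv(EBAY_TOK = token)
  invisible(TRUE)
}

get_token_sketch <- function() {
  # unset = NA_character_ lets us tell "missing" apart from a real value
  token <- Sys.getenv("EBAY_TOK", unset = NA_character_)
  if (is.na(token) || !nzchar(token)) {
    stop("No token found in 'EBAY_TOK'; set one first.")
  }
  token
}

set_token_sketch("YOUR_EBAY_TOKEN")
get_token_sketch()
```

A variable set with Sys.setenv() lives only for the current R session. To keep it between sessions you can put a line such as EBAY_TOK=YOUR_EBAY_TOKEN in your ~/.Renviron file.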

Now Search, I said, Search!

You can search using keywords like above. You can search in a specific category or categories (a.k.a Browse):

search_ebay(categoryName = "Video Game Consoles")
## # A tibble: 10 x 17
##          itemId                                                                            title categoryId
##           <chr>                                                                            <chr>      <dbl>
##  1 192378276474 Microsoft - Xbox One S 500GB Madden NFL 18 Bundle with 4K Ultra HD Blu-ray - ...     139971
##  2 332044790216                 Sony Playstation PS Vita - New Slim Model - PCH-2006 (Aqua Blue)     139971
##  3 173037034786                                        Nintendo Switch Console with Gray Joy-Con     139971
##  4 182891810676    Sony Playstation Vita - PS Vita - New Slim Model - PCH-2006 (Aqua Blue) NEW!!     139971
##  5 202138426458               RETRO NES Nintendo MINI CONSOLE 620 Games  FAST SHIPPING FROM USA!     139971
##  6 122865047812    Microsoft Xbox One X Project Scorpio Edition - 1TB - Black Console - SOLD OUT     139971
##  7 311983051175                             Sony PlayStation 4 Pro 1TB Console PS4 Pro Brand NEW     139971
##  8 322795045685  Super Nintendo Entertainment System SNES Classic Edition Mini IN HAND SHIPS NOW     139971
##  9 292352820223                 Microsoft Xbox One S Minecraft Complete Adventure Bundle (500GB)     139971
## 10 292331271596                 PlayStation 4 Slim 1TB Console - Star Wars Battlefront II Bundle     139971
## # ... with 14 more variables: categoryName <chr>, viewItemURL <chr>, location <chr>, sheepingType <chr>,
## #   shipToLocations <chr>, isMultiVariationListing <lgl>, conditionId <dbl>, conditionName <chr>,
## #   listingType <chr>, startTime <date>, endTime <date>, watchCount <dbl>, price <dbl>, currency <chr>

You can search by both keywords and category, e.g. you want to see the game Yahtzee in both actual games and in non-fiction books, like manuals:

search_ebay("Yahtzee", categoryName = c("Nonfiction", "Board & Traditional Games"))
## # A tibble: 10 x 17
##          itemId                                                                         title categoryId
##           <chr>                                                                         <chr>      <dbl>
##  1 201620664639                                    250 Triple Yahtzee Score Sheets Pads Cards     180349
##  2 201861869361                        SALE!!!! YAHTZEE SCORE PADS CARDS  600 SHEETS YAHTZEE      180349
##  3 321852161038                                                               Yahtzee Classic     180349
##  4 201915780340           LOWEST PRICE!!!! YAHTZEE SCORE PADS CARDS  2000 SHEETS YAHTZEE GAME     180349
##  5 222388708716                                                    Yahtzee Classic Board Game     180349
##  6 162465852979                         Back to the Future Yahtzee Dice Game UPC 700304046840     180349
##  7 282775230522 New - Yahtzee Steal the Deal Dice Game by Hasbro Gaming Ages 8+ (2-5 players)     180349
##  8 322654819049                                                                       Yahtzee     180349
##  9 322923844346                                                            Yahtzee Electronic     180349
## 10 201761399117                                          50 Triple Yahtzee Score Sheets Cards     180349
## # ... with 14 more variables: categoryName <chr>, viewItemURL <chr>, location <chr>, sheepingType <chr>,
## #   shipToLocations <chr>, isMultiVariationListing <lgl>, conditionId <dbl>, conditionName <chr>,
## #   listingType <chr>, startTime <date>, endTime <date>, watchCount <dbl>, price <dbl>, currency <chr>

But this is kids’ stuff. So far we have accepted all of search_ebay’s defaults. Let’s search the UK site for new large size Fruit of the Loom women’s T-shirts from top rated sellers at a max price of 10 pounds!

res <- search_ebay("women's t-shirts",
                   site = "UK",
                   condition = "New",
                   listingType = "FixedPrice",
                   topRatedSellerOnly = TRUE,
                   priceRange = c(0, 10),
                   aspectFilter = list(
                     Brand = "Fruit of the Loom",
                     Size = "L"
                     )
                   )

res[, c("title", "price", "currency")]
## # A tibble: 10 x 3
##                                                                               title price currency
##                                                                               <chr> <dbl>    <chr>
##  1 Mens Ladies Womens Novelty Print TShirt Funny Tee Rude Joke Xmas Top Gift Unisex  4.99      GBP
##  2 WOMEN'S LADIES CUSTOM PRINTED PERSONALISED T-SHIRTS TEE SHIRT HEN WORK WHOLESALE  3.50      GBP
##  3       LADIES 100% COTTON T-SHIRT - FRUIT of the LOOM PLAIN T SHIRT Womens Female  2.99      GBP
##  4 Fruit of the Loom 100% Cotton Plain Blank Men's Women's Tee Shirt Tshirt T-Shirt  3.25      GBP
##  5 Men's Women's Fruit of the Loom Plain 100% Cotton Blank Tee Shirt Tshirt T-Shirt  2.95      GBP
##  6   DANGEROUS WOMAN MUSIC TOUR ARIANA GRANDE FAN TUMBLR FASHION MENS WOMENS TSHIRT  9.99      GBP
##  7  Fruit of the Loom Long Sleeve 100% Cotton Plain Blank Men's Women's Tee Shirt's  4.08      GBP
##  8                                  madness t-shirt retro style men's women's sizes  6.00      GBP
##  9    Fruit of the Loom 100% Cotton Plain Blank Men's Women's T-Shirts Value Weight  3.57      GBP
## 10 VOGUE HIPSTER SWAG DOPE CELFIE GIFT TUMBLR FASHION FUNNY MENS WOMENS TOP TSHIRT   8.99      GBP

OMG!

So, I think most of what you should know you can read in the very detailed documentation of the search_ebay function (do ?search_ebay). There are a few things worth emphasizing before you continue:

  • The ebayr search_ebay function currently wraps only the Finding API and specifically the “findItemsAdvanced” call. This means you can only search for items which are live on site. If there’s a requirement I might also wrap the “findCompletedItems” call which allows you to dig in the past and find items which are no longer live, maybe because they were sold. For more documentation of the original API see the API Reference.
  • Whatever you do, make sure you adhere to ebay’s API license guidelines. If you abuse this API, ebay could easily block all users of this package.
  • Speaking of abuse, there is a rate limit to the API of 5,000 requests per day. This is for your own good. And please notice that although you can specify nResults = 10000 to get 10K items [1], the maximum results per request or page is 100. The function will simply take care of the pagination for you, calling the API 100 times.
  • Currently only searching in the US, UK, German and Australian sites is possible.
  • By default the function will return the items tibble, which will hopefully contain the relevant items with a few attributes (see below). If you specify returnAll = TRUE you can get a more detailed instance of an R6 EbayResponse class, holding also the status code for the httr::GET request and more. See ?search_ebay.
  • The API does not actually search by category name, which is not unique (e.g. there is more than a single category on the ebay US site named “Digital Cameras”!). It searches by category ID, which is unique. The categoryName parameter is there for casual beginner users, but once you gain experience you should probably specify one or more categories through the categoryId parameter. If you stick to categoryName and a name you specified matches more than a single category ID, the function will warn you this happened but will continue regardless (up to the maximum of 3 categories allowed by the API). If no category is found whose name matches the input exactly, the function will suggest a few similar ones, so you can pick one and try again. Where to find the category name or ID?
    • through the ebay site (in an item’s page, in the URL)
    • through previous results in the items tibble
    • or you might not need it, as the ebay search is quite good (e.g. shouldn’t get results for “Nikon DSLR” in “Women’s Handbags”)
  • aspectFilter: this parameter allows you to further refine your results, e.g. specifying aspectFilter = list("Storage Capacity" = "64GB") when searching for “iPhone X”, will only return iPhones with the specified Storage Capacity. Naturally these possible name-values differ from category to category and currently the best place to look for them is by looking at the left side bar in the ebay search results page. Only one “value” per “name” is supported in this function, and the “name” changes according to the ebay site you’re searching in. For example to look for black shirts in the US site you specify aspectFilter = list(Color = "Black"), in UK aspectFilter = list("Main Colour" = "Black") and in the German site (“DE”): aspectFilter = list(Farbe = "Schwarz").
  • ebay started as an auction site, but today most items have a fixed price. There are still auction items, whose final price is not yet determined, and you should be aware of that, specifying listingType = "FixedPrice" if fixed price items are all you’re after.
  • Speaking of price, you should be careful to look at the currency of the price in the items tibble. It need not be USD, even when searching the US site!
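
To make the pagination arithmetic from the list above concrete, here is a small base R sketch. The numbers (100 results per page, 5,000 requests per day) come from this post; the helper names are made up and not part of ebayr, and I am not claiming this is how search_ebay splits its requests internally.

```r
# Hypothetical helpers, NOT part of ebayr.
# Assumptions taken from the text: at most 100 results per page,
# and a daily limit of 5,000 API requests.
calls_needed <- function(nResults, perPage = 100) {
  stopifnot(nResults >= 1, perPage >= 1)
  ceiling(nResults / perPage)
}

# One row per request: the page number and how many entries it would hold.
page_plan <- function(nResults, perPage = 100) {
  nPages <- calls_needed(nResults, perPage)
  sizes <- rep(perPage, nPages)
  # the last page may be only partially filled
  sizes[nPages] <- nResults - perPage * (nPages - 1)
  data.frame(page = seq_len(nPages), entries = sizes)
}

calls_needed(10000)         # 100 calls, as mentioned above
calls_needed(250)           # 3 calls
page_plan(250)              # pages holding 100, 100 and 50 entries
5000 / calls_needed(10000)  # 50 such searches fit in one day's limit
```

So a single nResults = 10000 search already eats 2% of your daily quota, something to keep in mind before wrapping search_ebay in a loop.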

There is more to know. See ?search_ebay and the API documentation.

But let’s also look at what is returned in the items tibble:

  • itemId ebay’s unique item ID for each item
  • title item’s title as it appears on site
  • categoryId item’s primary category ID, note that an item can have a secondary category, not returned here, and you could search for category 1 and get also some results from category 2 because of this.
  • categoryName category’s name
  • viewItemURL item’s URL on ebay
  • location a string specifying the item’s location, e.g. “Melbourne, Australia”
  • shippingType free or some other values, see API Reference
  • shipToLocations Worldwide or other values, see API Reference
  • isMultiVariationListing see details on the hideDuplicateItems parameter
  • conditionId ebay’s ID for a condition, e.g. “1000” for “New”
  • conditionName name of condition, e.g. “New”, “Used”
  • listingType see details on the listingType parameter
  • startTime the time the item successfully uploaded to site
  • endTime the time the item is scheduled to end, this could be sooner if it sells
  • watchCount no. of users watching the item in their watchlist, could be NA for some reason, maybe when it’s zero
  • price current price of the item in the site’s currency, notice what this means for auction items
  • currency the price currency, e.g. “AUD” for Australian Dollars
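
Two of these columns deserve a bit of care before you summarise anything: currency and watchCount. Below is a hedged base R sketch on a few made-up rows (not real ebay data) that only mimic the column names listed above. Treating a missing watchCount as zero is my assumption, not something the API guarantees.

```r
# Made-up rows mimicking a few columns of the items tibble (NOT real ebay data).
items <- data.frame(
  itemId = c("1", "2", "3", "4"),
  price = c(10, 25, 8, 40),
  currency = c("USD", "USD", "GBP", "USD"),
  watchCount = c(3, NA, 12, NA),
  stringsAsFactors = FALSE
)

# Keep a single currency, so that prices are actually comparable.
usd <- items[items$currency == "USD", ]

# Assumption: a missing watchCount means nobody is watching the item.
usd$watchCount[is.na(usd$watchCount)] <- 0

median(usd$price)     # median over USD items only
sum(usd$watchCount)   # total watchers, counting NA as zero
```

Without the currency filter, the GBP item would have been mixed in with the USD prices.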

Browse Inspiration

A T-Shirt cost vs. Brand

A T-shirt. For women. Size Large. How many results? (Here I use the fact I already know the category ID, see above, it’s not complicated)

res <- search_ebay(categoryId = 63869, aspectFilter = list("Size (Women's)" = "L"),
                   returnAll = TRUE,
                   verbose = FALSE)
res$nResultsAvailable
## [1] 1343681

1.3 million results. Yes, that’s ebay for you.

Let’s further limit our results by asking for fixed price items only and new items only, and by excluding any items with a currency other than USD or items not located in the US.

Let’s look at the distribution of price for these T-shirts, by some popular brands, getting the first 100 shirts by “Best Match”, for each brand:

library(tidyverse)
library(ggridges)
library(viridis)
library(scales)

brands <- c(
  'Unbranded',
  'Gildan',
  'Brisco Brands',
  'Next Level',
  'DC Comics',
  'Fruit of the Loom',
  'Disney',
  'Anvil',
  'Batman',
  'Handmade',
  'Bella + Canvas',
  'Hanes',
  'Harry Potter',
  'Victoria\'s Secret',
  'JERZEES'
)

getPricesForBrand <- function(brand) {
  res <- search_ebay(categoryId = 63869,
                   listingType = "FixedPrice", condition = "New",
                   aspectFilter = list("Size (Women's)" = "L",
                                       Brand = brand),
                   nResults = 100, verbose = FALSE)
  res %>%
    mutate(isUS = stringr::str_detect(location, "USA")) %>%
    filter(currency == "USD", isUS == TRUE) %>%
    select(price) %>%
    unlist() %>%
    unname()
}

tibble(brand = brands) %>%
  mutate(price = map(brand, getPricesForBrand)) %>%
  unnest(price) %>%
  group_by(brand) %>%
  mutate(medianPrice = median(price)) %>%
  ggplot(aes(x = price, y = reorder(brand, -medianPrice), fill = ..x..)) +
  geom_density_ridges_gradient(rel_min_height = 0.0) +
  scale_fill_viridis(name = "Price USD", option = "C", labels = dollar_format()) +
  theme_ridges(font_size = 12, grid = TRUE) +
  theme(axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        text = element_text(family="mono"),
        axis.text.y = element_text(size = 9)) +
  labs(title = "Price of Women's Large Size T-Shirts by Brand on ebay",
       subtitle = "Obtained with the ebayr package, 2017/12/19") +
  scale_x_continuous(limits = c(0, 60), labels = dollar_format())

USA vs. China

Ever get the feeling stuff from China costs much (much much) less than in the US? Let’s look for some stuff and compare. We’ll make sure we have at least 20 items for the same keywords from each country and look at the medians:

library(ggalt)

stuff <- c(
  "hulk costume",
  "elsa costume",
  "ballpen",
  "wall painting",
  "dog sweater",
  "wireless earphones",
  "ice pick",
  "iphone case",
  "256 GB memory card",
  "sunglasses",
  "keychain",
  "cocktail dress",
  "wedding ring",
  "elephant doll",
  "tuxedo",
  "cookie cutter",
  "lipstick",
  "tatoo ink",
  "disco ball",
  "measuring spoon"
)

decideLocation <- function(location) {
  if (stringr::str_detect(location, "USA")) {
    "USA"
  } else if (stringr::str_detect(location, "China")) {
    "China"
  } else {
    "Other"
  }
}

getPricesForThingInUSChina <- function(thing, minSampleSize = 20, .nResults = 1000) {
  res <- search_ebay(thing,
                   listingType = "FixedPrice", condition = "New",
                   nResults = .nResults, verbose = FALSE)
  summaryRes <- res %>%
    mutate(location2 = map_chr(location, decideLocation)) %>%
    filter(currency == "USD", location2 %in% c("USA", "China")) %>%
    group_by(location2) %>%
    summarise(n = n(), medianPrice = median(price)) %>%
    mutate(sampleLargeEnough = n >= minSampleSize)
  
  if (nrow(summaryRes) == 2 && all(summaryRes$sampleLargeEnough)) {
    return(list(China = unlist(summaryRes[
      summaryRes$location2 == "China",
      "medianPrice"]),
      USA = unlist(summaryRes[
      summaryRes$location2 == "USA",
      "medianPrice"])))
  } else {
    return(NA)
  }
}

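tibble(thing = stuff) %>%
  mutate(price = map(thing, getPricesForThingInUSChina)) %>%
  # drop things without a large enough sample in both countries (NA results)
  filter(!is.na(price)) %>%
  # spread the two medians into their own numeric columns
  mutate(China = map_dbl(price, "China"),
         USA = map_dbl(price, "USA")) %>%
    ggplot(aes(x = China, xend = USA, y = reorder(thing, -USA), group = thing)) + 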
    geom_dumbbell(color = "grey", 
                  colour_x = "indianred1",
                  colour_xend = "#0e668b",
                  size = 1,
                  size_x = 2,
                  size_xend = 2) + 
    scale_x_continuous(label = dollar_format(),
                       breaks = seq(0, 150, 30)) + 
    labs(title = "Median Price of Stuff on ebay, China vs. US",
         subtitle = "Obtained with the ebayr package, 2017/12/19, China in Red",
         x = "",
         y = "") +
    theme(plot.title = element_text(face = "bold"),
          text = element_text(family="mono"),
          axis.text.y = element_text(size = 9),
          axis.text.x = element_text(size = 12),
          plot.background=element_rect(fill="white"),
          panel.background=element_rect(fill="white"),
          panel.grid.minor=element_blank(),
          panel.grid.major.y=element_blank(),
          panel.grid.major.x=element_line(color = "grey"),
          axis.ticks=element_blank(),panel.border=element_blank())

Jeans in the USA

Finally, let’s map the price of used women’s jeans on the USA map. We’ll make sure we have at least 10 items for each state and look at the medians:

library(stringr)

getLocationUSState <- function(location) {
  stateIdx <- which(str_detect(location, str_c(",", state.abb, ",")))
  if (length(stateIdx) != 1) {
    return(NA_character_)
  } else {
    return(str_to_lower(state.name[stateIdx]))
  }
}

res <- search_ebay(categoryId = 11554, condition = "Used", listingType = "FixedPrice",
                   nResults = 5000, verbose = FALSE)

medianByState <- res %>%
  filter(currency == "USD") %>%
  mutate(region = map_chr(location, getLocationUSState)) %>%
  group_by(region) %>%
  summarise(n = n(), medianPrice = median(price)) %>%
  filter(n >= 10) %>%
  na.omit() %>%
  right_join(tibble(region = c(str_to_lower(state.name), "district of columbia")), "region")

ggplot(data = map_data("state") %>% inner_join(medianByState)) +
  geom_polygon(aes(x = long, y = lat, fill = medianPrice, group = group), color = "white") + 
  coord_fixed(1.3) +
  labs(title = "Median Price of Used Women's Jeans on ebay by State",
       subtitle = "Obtained with the ebayr package, 2017/12/19, at least 10 items per state",
       x = "",
       y = "") +
  theme(plot.title = element_text(face = "bold"),
        text = element_text(family="mono"),
        axis.text.y = element_blank(),
        axis.text.x = element_blank(),
        plot.background=element_rect(fill="white"),
        panel.background=element_rect(fill="white"),
        axis.ticks=element_blank()) +
  scale_fill_continuous(name = "Price USD", labels = dollar_format())
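
If the state matching in getLocationUSState looks like magic, here is a base R re-creation of the same idea that you can try on a few made-up location strings. It assumes locations look like "City,ST,USA", which is what the ",NY," style pattern above implies; the function name is hypothetical and the example strings are invented.

```r
# Base R re-creation of the matching idea, for illustration only.
# Assumption: US locations look like "City,ST,USA".
match_state_sketch <- function(location) {
  # state.abb and state.name ship with base R (datasets package)
  patterns <- paste0(",", state.abb, ",")
  hits <- which(vapply(patterns, grepl, logical(1),
                       x = location, fixed = TRUE, USE.NAMES = FALSE))
  # exactly one state abbreviation must match, otherwise give up
  if (length(hits) != 1) NA_character_ else tolower(state.name[hits])
}

match_state_sketch("Brooklyn,NY,USA")
match_state_sketch("Shenzhen,China")
```

The first call gives "new york" and the second gives NA, which is why non-US items simply drop out of the map once na.omit() is applied.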

Let’s see the top and bottom states:

medianByState %>%
    arrange(-medianPrice) %>%
    slice(c(1:5, 39:43))
## # A tibble: 10 x 3
##            region     n medianPrice
##             <chr> <int>       <dbl>
##  1       new york   121      29.950
##  2           utah    43      29.760
##  3     california   502      29.195
##  4      wisconsin    59      28.990
##  5        florida   402      28.140
##  6 north carolina   142      17.950
##  7       maryland    47      17.840
##  8           iowa   656      16.150
##  9       delaware    15      15.740
## 10       missouri    81      15.390

So the price for used women’s jeans in the state of New York is about twice the price in Missouri.

Remember Kids

Again I’d like to emphasize the importance of adhering to ebay’s API license guidelines. You are now able to programmatically search through ebay! Hurray ebay, hurray httr, hurray ebayr.


  1. Why would you want that?