How to get JSON from a website in Python


Requests with urllib.request and json. Practising requests against open APIs

How the data is packed, in terms of Python's standard types:

  • Outer layer: a dictionary (curly braces). It has three keys, all strings: "people" (the people in space), "message" (whether the data was delivered successfully) and "number" (how many people are in space).
  • The value under the "people" key is a list (square brackets). The list holds three more dictionaries, one per astronaut.
  • Inner layer: a dictionary (curly braces). It has two keys: "craft" (the spacecraft) and "name" (the astronaut's name).

Decoded, the response reads:

  1. the message was delivered successfully (success),
  2. there are 3 people in space,
  3. the people in space are Andrew Morgan, Oleg Skripochka and Jessica Meir, all aboard the ISS.

A program that fetches this information and prints it:
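The original listing is not preserved here, so what follows is only a minimal sketch of such a program. The URL is an assumption (the Open Notify "people in space" endpoint; the text does not name it), and `fetch_json` and `describe_crew` are illustrative names. The sample dictionary mirrors the structure described above so that the parsing part runs without a network connection.

```python
import json
import urllib.request

# Assumption: the Open Notify "people in space" endpoint; the text does not name the URL.
ASTROS_URL = "http://api.open-notify.org/astros.json"


def fetch_json(url, timeout=10):
    """Open the URL and decode the JSON body into Python objects."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def describe_crew(data):
    """Turn the decoded dictionary into printable lines."""
    lines = [
        f"Message: {data.get('message')}",
        f"People in space: {data.get('number')}",
    ]
    for astr in data.get("people", []):
        lines.append(f"{astr.get('name')} ({astr.get('craft')})")
    return lines


# Sample with the shape described above, so the parsing runs offline.
sample = {
    "message": "success",
    "number": 3,
    "people": [
        {"craft": "ISS", "name": "Andrew Morgan"},
        {"craft": "ISS", "name": "Oleg Skripochka"},
        {"craft": "ISS", "name": "Jessica Meir"},
    ],
}

for line in describe_crew(sample):
    print(line)
```

For live data, pass `fetch_json(ASTROS_URL)` to `describe_crew` instead of `sample`.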

Used:

  • the function urllib.request.urlopen()
  • the function json.load()
  • the dictionary method get(). Direct access by key also works: astr['name'].

Exercise: ISS pass times (almost) over Moscow

Task: get the times at which the ISS passes through the sky over Moscow. Print:

  1. How many ISS passes the response to the request contains
  2. For each ISS pass: the rise time (risetime), the duration, and the set time, in human-readable form.

To turn Unix time into a human-readable day, month, year, hours, minutes and seconds, use the ctime function from the time library:
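A small illustration of that conversion. The timestamp is a made-up value, and the printed text depends on the time zone of the computer running the code.

```python
import time

rise = 1577836800  # hypothetical Unix timestamp: seconds since 1970-01-01 00:00 UTC

# ctime() formats the timestamp in the local time zone of this computer.
readable = time.ctime(rise)
print(readable)
```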

The start of the program, which you need to finish:

Solution
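The original solution is not preserved, so this is one possible sketch. It assumes the decoded answer holds a "response" list whose items carry "risetime" (Unix seconds) and "duration" (seconds); the sample values and the name `iss_passes` are hypothetical. The set time is computed as rise time plus duration.

```python
import time


def iss_passes(data):
    """Return (rise, duration, set) tuples; rise and set are Unix seconds."""
    passes = []
    for item in data.get("response", []):
        rise, duration = item["risetime"], item["duration"]
        passes.append((rise, duration, rise + duration))
    return passes


# Hypothetical sample in the assumed shape of the answer.
sample = {
    "message": "success",
    "response": [
        {"risetime": 1577836800, "duration": 600},
        {"risetime": 1577842600, "duration": 450},
    ],
}

passes = iss_passes(sample)
print("Passes in the response:", len(passes))
for rise, duration, set_time in passes:
    print(time.ctime(rise), "|", duration, "s |", time.ctime(set_time))
```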

In the evening sky the ISS is the brightest object after the Sun and the Moon.

Exact time

Task: get the exact time and compare it with the clock of the computer running the program.
Print:

  1. the exact date and time,
  2. the difference from the computer's clock, in minutes and seconds,
  3. For top marks: print the sunrise and sunset times.

The exact time is available under the "time" key as Unix time, but in milliseconds.
So the value has to be divided by 1000 before it is passed to time.ctime().

To get the computer's time, call time.time() with no arguments.
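The arithmetic can be sketched as follows. The millisecond value stands in for what the service would return under the "time" key, the local time is fixed so the result is reproducible, and `clock_offset` is an illustrative name.

```python
import time


def clock_offset(server_ms, local_s):
    """Return (exact_time_s, minutes, seconds) for a server time given in milliseconds."""
    exact_s = server_ms / 1000        # milliseconds -> seconds
    diff = abs(exact_s - local_s)     # size of the offset, whichever clock is ahead
    minutes, seconds = divmod(diff, 60)
    return exact_s, int(minutes), seconds


# Hypothetical value standing in for data["time"] from the service.
server_ms = 1577836925500
exact_s, minutes, seconds = clock_offset(server_ms, local_s=1577836800.0)

print(time.ctime(exact_s))
print(f"Offset: {minutes} min {seconds:.1f} s")
```

In the real program the second argument would be `time.time()`.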

Getting the sunrise and sunset times means unpacking nested data like a matryoshka doll: first look up the "clocks" key, then the "213" key (the code for Moscow), and then, on that result, the "sunrise" and "sunset" keys.
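A sketch of that nested lookup on a hand-written fragment. The structure follows the description above; the sunrise and sunset values are placeholders, not real data.

```python
# Hypothetical fragment in the shape described above; the times are placeholders.
data = {
    "time": 1577836925500,
    "clocks": {
        "213": {"sunrise": "08:58", "sunset": "16:07"},
    },
}

# Step inwards one key at a time; the city code is the string "213", not a number.
moscow = data["clocks"]["213"]
print("Sunrise:", moscow["sunrise"])
print("Sunset:", moscow["sunset"])

# With get(), a missing level yields a default instead of raising KeyError.
missing = data.get("clocks", {}).get("2", {}).get("sunrise")
print(missing)
```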

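It is also reminiscent of the fairy tale of Koschei the Deathless, whose death is hidden in a needle, inside an egg, inside a duck, inside a hare, inside a chest.

The start of the program, which you need to finish:

Solution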

Bonus exercise: the dollar exchange rate

Task: get the averaged exchange rates of the ruble against the dollar, the euro and other currencies.
Print:

  • the rate of the ruble ("RUB") against the dollar ("USD")
  • the rate of the ruble against the euro ("EUR")
  • the rate of the ruble against the cheapest and the most expensive of the listed currencies

To find out how many rubles one unit of a currency costs, divide the ruble's rate by the rate of the chosen currency.
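The ratio can be sketched like this. The rates are invented numbers, all quoted against the same base currency, and `rubles_per_unit` is an illustrative name.

```python
def rubles_per_unit(rates, currency):
    """How many rubles one unit of `currency` costs."""
    return rates["RUB"] / rates[currency]


# Hypothetical rates, all quoted against the same base currency.
rates = {"USD": 1.0, "EUR": 0.8, "RUB": 80.0, "JPY": 160.0}

# Price of every non-ruble currency, expressed in rubles.
prices = {code: rubles_per_unit(rates, code) for code in rates if code != "RUB"}

cheapest = min(prices, key=prices.get)
dearest = max(prices, key=prices.get)

print(f"USD: {prices['USD']:.2f} RUB")
print(f"EUR: {prices['EUR']:.2f} RUB")
print(f"Cheapest: {cheapest}, most expensive: {dearest}")
```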

The start of the program, which you need to finish:

Solution

For self-study: reference material and examples of open APIs

Troubleshooting

If there is no network connection, use the data from the "Request result" section. The program for the "Astronauts" exercise:
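The original program is not preserved. A minimal sketch of the idea: try the network first and fall back to a saved copy of the response if the request fails. The saved text below is a shortened, made-up stand-in for the "Request result" data, and `load_astros` and `offline` are illustrative names.

```python
import json
import urllib.error
import urllib.request

# Shortened, made-up stand-in for the text of the "Request result" section.
SAVED_RESPONSE = (
    '{"message": "success", "number": 1, '
    '"people": [{"craft": "ISS", "name": "Jessica Meir"}]}'
)


def fetch_json(url, timeout=10):
    """Open the URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def load_astros(fetch, saved_text=SAVED_RESPONSE):
    """Call fetch(); if the network fails, parse the saved text instead."""
    try:
        return fetch()
    except OSError:  # urllib.error.URLError and socket timeouts are OSError subclasses
        return json.loads(saved_text)


def offline():
    """Simulates a machine without a connection."""
    raise urllib.error.URLError("no network")


data = load_astros(offline)
print(data["number"], [person["name"] for person in data["people"]])
```

With a working connection the call would look like `load_astros(lambda: fetch_json(url))`, where `url` is the address of the service.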

How to Get JSON from a URL in Python

Being able to retrieve data from remote servers is a fundamental requirement for most projects in web development. JSON is probably one of the most popular formats for data exchange due to its lightweight and easy to understand structure that is fairly easy to parse. Python, being a versatile language, offers a variety of ways to fetch JSON data from a URL in your web project.

In this article, we'll explore how to use Python to retrieve JSON data from a URL. We'll cover three libraries (requests, urllib, and aiohttp) and show how to extract and parse the JSON data using Python's built-in json module. Additionally, we'll discuss common errors that may occur when fetching JSON data, and how to handle them in your code.

Using the requests Library

One popular library for fetching data from URLs in Python is requests. It provides an easy-to-use interface for sending HTTP requests to retrieve data from remote servers. To use requests, you'll first need to install it by running pip install requests in your terminal.

Once we have requests installed, we can use it to fetch JSON data from a URL using the get() method. Say we want to fetch posts from the dummy API at jsonplaceholder.typicode.com/posts.
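The article's own listing is missing here, so the following is a minimal sketch rather than the original code. `fetch_posts` and `titles` are illustrative names, and the sample posts are invented values in the shape the API returns. The import sits inside the function only so that the offline part of the example runs even where requests is not installed.

```python
POSTS_URL = "https://jsonplaceholder.typicode.com/posts"


def fetch_posts(url=POSTS_URL):
    """GET the URL and return the decoded JSON (a list of post dictionaries)."""
    import requests  # third-party package: pip install requests

    response = requests.get(url, timeout=10)
    return response.json()


def titles(posts):
    """Pull the title out of every post dictionary."""
    return [post["title"] for post in posts]


# Invented posts in the shape the API returns, so the helper runs offline.
sample_posts = [
    {"userId": 1, "id": 1, "title": "first post", "body": "..."},
    {"userId": 1, "id": 2, "title": "second post", "body": "..."},
]
print(titles(sample_posts))
```

Calling `fetch_posts()` performs the real request and returns the full list.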

We used the get() method to fetch JSON data from the URL https://jsonplaceholder.typicode.com/posts, extracted the JSON data using the json() method, and printed it to the console. And that's pretty much it! You will get the JSON response stored as a Python list, with each post represented by one dictionary in that list. Each of those dictionaries has the keys userId, id, title, and body.

But, what if the API request returns an error? Well, we'll handle that error by checking the status code we got from the API when sending a GET request:
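A sketch of that check, since the original listing is not preserved. `json_or_none` and `fetch_checked` are illustrative names, and `FakeResponse` is a stand-in defined here (it is not part of requests) so that both branches can be shown without a network call.

```python
def json_or_none(response):
    """Return the decoded JSON on HTTP 200; otherwise report the code and return None."""
    if response.status_code == 200:
        return response.json()
    print(f"Error: request failed with status code {response.status_code}")
    return None


def fetch_checked(url):
    """Send the GET request and apply the status check to the real response."""
    import requests  # third-party package: pip install requests

    return json_or_none(requests.get(url, timeout=10))


class FakeResponse:
    """Stand-in exposing only status_code and json(); not part of requests."""

    def __init__(self, status_code, payload=None):
        self.status_code = status_code
        self._payload = payload

    def json(self):
        return self._payload


print(json_or_none(FakeResponse(200, [{"id": 1}])))
print(json_or_none(FakeResponse(404)))
```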

In addition to what we have already done, we checked the status code of the response to ensure that the request was successful. If the status code is 200, we print the extracted JSON in the same fashion as before, and if the status code is not 200, we print an error message.

Note: The requests library automatically handles decoding JSON responses, so you don't need to use the json module to parse the response. Instead, you can use the json() method of the response object to extract the JSON data as a Python dictionary or list.

This method will raise a ValueError if the response body does not contain valid JSON.
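One way to guard against that, sketched with a stand-in object so it runs offline. `safe_json` and `FakeResponse` are illustrative names; the stand-in parses with the standard json module, whose json.JSONDecodeError is a subclass of ValueError.

```python
import json


def safe_json(response):
    """Return the decoded body, or None if the body is not valid JSON."""
    try:
        return response.json()
    except ValueError:
        return None


class FakeResponse:
    """Stand-in with a json() method; not part of requests."""

    def __init__(self, text):
        self.text = text

    def json(self):
        # json.JSONDecodeError is a subclass of ValueError.
        return json.loads(self.text)


print(safe_json(FakeResponse('[1, 2, 3]')))
print(safe_json(FakeResponse('<html>Service unavailable</html>')))
```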

Using the urllib Library

Python's built-in urllib package provides a simple way to fetch data from URLs. To fetch JSON data from a URL, you can use the urllib.request.urlopen() function.

After fetching the JSON from the API URL of choice, we check the status code of the response to ensure that the request was successful. If the status code is 200, we parse the JSON data using the json.loads() function and print the title of each post.

It's worth noting that urllib does not automatically decode response bodies, so we need to call decode() on the bytes we read to turn them into a string. We then use json.loads() to parse the JSON data.
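A sketch of the steps described above, not the article's original listing. The UTF-8 encoding is an assumption, `parse_posts` and `fetch_posts` are illustrative names, and the raw bytes are an invented sample so the decoding step runs offline.

```python
import json
import urllib.request


def parse_posts(raw_bytes):
    """Decode the raw body to text, then parse the text as JSON."""
    text = raw_bytes.decode("utf-8")  # assumption: the body is UTF-8 encoded
    return json.loads(text)


def fetch_posts(url):
    """Fetch the URL and return the parsed posts, or an empty list on a non-200 status."""
    # urlopen() raises urllib.error.HTTPError for 4xx and 5xx answers,
    # so this check only catches other non-200 statuses.
    with urllib.request.urlopen(url, timeout=10) as response:
        if response.status != 200:
            print("Error: status code", response.status)
            return []
        return parse_posts(response.read())


# Invented body, so the decoding step runs without a network.
raw = b'[{"id": 1, "title": "first post"}, {"id": 2, "title": "second post"}]'
for post in parse_posts(raw):
    print(post["title"])
```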

Advice: If you want to know more about parsing JSON objects in Python, you should definitely read our guide "Reading and Writing JSON to a File in Python".

Using the aiohttp Library

In addition to urllib and requests, there is another library that is commonly used for making HTTP requests in Python: aiohttp. It's an asynchronous HTTP client/server library built on asyncio, which lets a program send many requests concurrently instead of waiting for each one in turn.

To use aiohttp, you'll need to install it by running pip install aiohttp in your terminal.

Once installed, you can start using it. Let's fetch JSON data from a URL using the aiohttp library:
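The original listing is missing; this sketch follows the description in the text (a `fetch_json` coroutine, a `main` coroutine, and `asyncio.run()`). The URL is the dummy posts API used earlier, chosen here as an assumption. The import sits inside the coroutine only so that the definitions load even where aiohttp is not installed.

```python
import asyncio


async def fetch_json(url):
    """Send a GET request and return the decoded JSON body."""
    import aiohttp  # third-party package: pip install aiohttp

    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.json()


async def main():
    """Fetch the posts and print them."""
    data = await fetch_json("https://jsonplaceholder.typicode.com/posts")
    print(data)


# Start the event loop and run the request with:
#     asyncio.run(main())
```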


We defined an async function fetch_json that takes a URL as input and uses aiohttp to make an HTTP GET request to that URL. We then used the response.json() method to convert the response data to a Python object.

We also defined an async function main that simply calls fetch_json with a URL and prints the resulting JSON data.

Finally, we used the asyncio.run() function to run the main function and fetch the JSON data asynchronously.

Overall, aiohttp can be a great choice for applications that need to make a large number of HTTP requests, because the requests can run concurrently instead of one after another. However, it may have a steeper learning curve than urllib and requests due to its asynchronous nature and its use of asyncio.

Which Library to Choose?

When choosing a library for getting JSON data from a URL in Python, the decision often comes down to the specific needs of your project. Here are some general guidelines to consider:

  • For simple requests or legacy code: If you're making simple requests or working with legacy code, urllib may be a good choice due to its built-in nature and compatibility with older Python versions.
  • For ease of use: If ease of use and simplicity are a priority, requests is often the preferred choice. It has a user-friendly syntax and offers many useful features that make it easy to fetch JSON data from a URL.
  • For high performance and scalability: If your application needs to make a large number of HTTP requests concurrently, aiohttp may be the best choice. It handles requests asynchronously, so the program does not sit idle while it waits for each response.
  • For compatibility with other asyncio-based code: If you're already using asyncio in your project, aiohttp may be the best choice because it is built on top of asyncio.

Conclusion

Getting JSON data from a URL is a common task in Python, and there are several libraries available for this purpose. In this article, we have explored three popular libraries for making HTTP requests: urllib, requests, and aiohttp.

urllib is built-in and suitable for simpler requests or legacy code, while requests offers a more user-friendly and robust interface. aiohttp is optimized for high-performance asynchronous apps and scalability, and is particularly useful for applications that need to make a large number of HTTP requests or require faster response times.

How to use an API with Python (Beginner’s Guide)

RapidAPI Team

Nowadays, Python is one of the most popular and accessible programming languages. In 2019 it was ranked third in the TIOBE rating. Many experts believe that in 3–4 years it will overtake C and Java to lead the ratings.

Based on this, it would not be surprising if you use Python for your next API interaction project. In this article, we will talk about the ins and outs of using an API and why Python is a great help in this task.

What is a REST API (from a Python perspective)

An API (Application Programming Interface) is a set of rules that are shared by a particular service. These rules determine in which format and with which command set your application can access the service, as well as what data this service can return in the response. The API acts as a layer between your application and external service. You do not need to know the internal structure and features of the service, you just send a certain simple command and receive data in a predetermined format.

REST API (Representational state transfer) is an API that uses HTTP requests for communication with web services.

It must comply with certain constraints. Here are some of them:

  1. Client-server architecture — the client is responsible for the user interface, and the server is responsible for the backend and data storage. Client and server are independent and each of them can be replaced.
  2. Stateless — no data from the client is stored on the server side. The session state is stored on the client side.
  3. Cacheable — clients can cache server responses to improve performance.

The full set of REST constraints also includes a uniform interface, a layered system, and, optionally, code on demand.

From the Python side, the REST API can be viewed as a data source located on an Internet address that can be accessed in a certain way through certain libraries.

Types of Requests

Types of Requests or HTTP Request Methods characterize what action we are going to take by referring to the API.

In total, there are four main types of actions:

  • GET: retrieve information (like search results). This is the most common type of request. Using it, we can get the data we are interested in from those that the API is ready to share.
  • POST: adds new data to the server. Using this type of request, you can, for example, add a new item to your inventory.
  • PUT: changes existing information. For example, using this type of request, it would be possible to change the color or value of an existing product.
  • DELETE: deletes existing information.
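To make the four methods concrete, here is a sketch that builds one request object per method with the standard library and prints it without sending anything. The endpoint api.example.com and the inventory payloads are hypothetical, and `build` is an illustrative helper.

```python
import json
import urllib.request

# Hypothetical inventory endpoint; the requests are built but never sent.
BASE = "https://api.example.com/items"


def build(method, url, payload=None):
    """Create a Request object for the given HTTP method and optional JSON body."""
    body = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return urllib.request.Request(url, data=body, headers=headers, method=method)


prepared = [
    build("GET", BASE),                                     # retrieve items
    build("POST", BASE, {"name": "lamp", "color": "red"}),  # add a new item
    build("PUT", f"{BASE}/1", {"color": "blue"}),           # change item 1
    build("DELETE", f"{BASE}/1"),                           # delete item 1
]

for req in prepared:
    print(req.get_method(), req.full_url)
```

Each object could then be sent with `urllib.request.urlopen(req)`.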

Prerequisites

In order to start working with the REST API through Python, you will need to connect a library to send HTTP requests.

The choice of the library depends on the version of Python.

If you use Python 2, we recommend using unirest because of its simplicity, speed, and ability to work with synchronous and asynchronous requests.

If you work with Python 3, we recommend requests, which is the de facto standard for making HTTP requests in Python.

In the rest of this article we will use Python 3.6 together with the requests library. This is how a GET request looks with requests:

The request returns a Response, a powerful object for inspecting the results of the request. Using Response, you can examine the headers and contents of the response, get a dictionary with data from JSON in the response, and also determine how successful our access to the server was by the response code from it. In our example, the response code was 200, which means that the request was successful. We will study the response codes and their values in a little more detail.

Status Codes

Status codes are returned with a response after each call to the server. They briefly describe the result of the call. There are many status codes; here are the ones you will meet most often:

  • 200 — OK. The request was successful. The answer itself depends on the method used (GET, POST, etc.) and the API specification.
  • 204 — No Content. The server successfully processed the request and did not return any content.
  • 301 — Moved Permanently. The server responds that the requested page (endpoint) has been moved to another address and redirects to this address.
  • 400 — Bad Request. The server cannot process the request because of a client-side error (for example, an incorrectly formatted request).
  • 401 — Unauthorized. Occurs when authentication failed because the credentials were incorrect or missing.
  • 403 — Forbidden. Access to the specified resource is denied.
  • 404 — Not Found. The requested resource was not found on the server.
  • 500 — Internal Server Error. Occurs when an unknown error has occurred on the server.
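The same codes and their standard phrases are available in Python's standard library through http.HTTPStatus, which is handy when a readable name is needed for a numeric code. A small sketch; `is_success` is an illustrative helper, not a library function.

```python
from http import HTTPStatus

# Print the standard reason phrase for each code in the list above.
for code in (200, 204, 301, 400, 401, 403, 404, 500):
    print(code, HTTPStatus(code).phrase)


def is_success(code):
    """True for the 2xx family, which signals a successful request."""
    return 200 <= code < 300


print(is_success(204), is_success(404))
```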

The requests library has several useful properties for working with status codes. For example, you can view the status code of a response by accessing .status_code:

That’s not all. You can use a Response instance in a conditional expression. It evaluates to True if the status code is less than 400, and False otherwise.

In order to work with REST APIs, it is important to understand what an Endpoint is.

Usually, an Endpoint is a specific address (for example, https://weather-in-london.com/forecast), by referring to which you get access to certain features/data (in our case — the weather forecast for London). Commonly, the name (address) of the endpoint corresponds to the functionality it provides.

To learn more about endpoints, we will look at simple API example within the RapidAPI service. This service is an API Hub providing the ability to access thousands of different APIs. Another advantage of RapidAPI is that you can access endpoints and test the work of the API directly in its section within the RapidAPI service.

Let’s take for example the Dino Ipsum API. This API is used to generate any amount of Lorem Ipsum placeholder text. It is useful when you prototype or test the interface of your application and want to fill it with any random content.

In order to find Dino Ipsum API section, enter its name in the search box in the RapidAPI service or go to the “Other” category from “All Categories” list and select this API from the list. Dino Ipsum API through RapidAPI is free, so you can get as much placeholder text as you want.

Once you select Dino Ipsum API, the first page you’ll see is the API Endpoints subsection. This includes most of the information needed to get started. The API Endpoints subsection includes navigation, a list of endpoints (just one for this API), the documentation of the currently selected endpoint, and a code snippet (available in 8 different programming languages) to help you get started with your code.

We will examine the only endpoint this API has — dinos list, which returns a certain amount of placeholder text, depending on the entered parameters. As we are practicing in Python now, we want to get a Python snippet and test it in our app. Fill in required parameters (format=text, words=10, paragraphs=1) and here is our snippet:

To use it with Python 3.6, we need to change unirest to requests. So, we get such an app:

Our app will call the endpoint, which is located at https://alexnormand-dino-ipsum.p.rapidapi.com/ and will print for us this nice placeholder text:

Craterosaurus Europasaurus Santanaraptor Dynamosaurus Pachyrhinosaurus Cardiodon Dakosaurus Kakuru Gracilisuchus Piveteausaurus.

Getting a JSON response from an API request

A REST API often returns its response in JSON format for ease of further processing. The requests library has a convenient .json() method for this case, which converts the JSON into a Python object.

The already familiar Dino Ipsum API will help us test this functionality. We can get JSON from it in response if we specify the format=json parameter when calling the dinos list endpoint.

Note that this time we did not put the query parameters in the URL but passed them in the params argument of the requests.get function. Passing parameters this way is generally preferable, because requests URL-encodes them for us.
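To see what passing a dictionary of parameters amounts to, the sketch below builds the query string with the standard library's urlencode. This is a stand-in for illustration, not the code requests runs internally. The parameter names come from the text; the second dictionary is a made-up example of characters that need escaping.

```python
from urllib.parse import urlencode

base_url = "https://alexnormand-dino-ipsum.p.rapidapi.com/"
params = {"format": "json", "words": 10, "paragraphs": 1}

# The dictionary becomes the part of the URL after the question mark.
query = urlencode(params)
print(f"{base_url}?{query}")

# Characters with a special meaning in URLs are escaped automatically.
escaped = urlencode({"q": "t rex & co"})
print(escaped)
```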

How to Start Using an API with Python

Having dealt with the nuances of working with API in Python, we can create a step-by-step guide:

1. Get an API key

An API Key is (usually) a unique string of letters and numbers. In order to start working with most APIs, you must register and get an API key. You will need to add the API key to each request so that the API can identify you. With RapidAPI, for example, you can choose whichever registration method is convenient for you: a username, email, and password, or a Google, Facebook, or GitHub account.
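A sketch of attaching a key to every request. The header names X-RapidAPI-Key and X-RapidAPI-Host, the environment variable RAPIDAPI_KEY, and the helper `auth_headers` are assumptions made for this illustration; check the code snippet shown on the API's page for the exact names.

```python
import os


def auth_headers(api_key, host):
    """Build the headers that identify the caller (header names are assumptions)."""
    return {
        "X-RapidAPI-Key": api_key,
        "X-RapidAPI-Host": host,
    }


# Read the key from the environment instead of hard-coding it in the source.
api_key = os.environ.get("RAPIDAPI_KEY", "YOUR_API_KEY")
headers = auth_headers(api_key, "alexnormand-dino-ipsum.p.rapidapi.com")
print(sorted(headers))
```

The dictionary would then be passed along with each call, for example `requests.get(url, headers=headers)`.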

2. Test API Endpoints with Python

Once we got the API key, we can refer to the API endpoints (according to the documentation) to check if everything is working as we expected. If we work with RapidAPI immediately after registering at the service, we can go to the section of needed API, subscribe to it if necessary, and test the answers of the endpoints we need directly on the API page. Next, we can generate a Python snippet that implements the functionality that we have just tested and quickly check it using IPython or simply insert it into our Python app.

3. Make your first Python app with API

After we checked the endpoints and everything works as we expected, we can start creating the application, including calls to the necessary API. As we already mentioned, RapidAPI will help us here. On the page of the API we need, we can use Code Snippet block and get Python snippet with access to the necessary endpoint. We just need to remember that if we use Python 3, we need to replace the unirest library with requests in the snippet code.

Python API Example: Earth view app with NASA API

Having in our hands the powerful features of Python and access to a wide range of APIs, we can do something great, such as exploring the depths of space or looking at Earth from orbit for a start. For such tasks, we will need NASA API, which is available through RapidAPI.

1. Get an API key

The NASA API is free and, in the basic case, does not require a special subscription. However, with intensive use of the API, you should sign up for a NASA developer key. We are not going to use it intensively, so immediately after registering with the RapidAPI service we will receive a service key and this will be enough for us. You can register by clicking on the Sign Up button in the RapidAPI menu.

As we already mentioned, you can register in any convenient way.

2. Test API Endpoints with Python

After registration, go to the NASA API page. Enter its name in the search box at the RapidAPI service or go to the “Science” category from “All Categories” list and select this API from the list.

We are interested in the getEPICEarthImagery endpoint, which gives a list of the names of Earth pictures taken by NASA's EPIC camera onboard the NOAA DSCOVR spacecraft. By default, it gives the most recent photos. If you wish, you can also specify the date of photos in the date parameter. We need fresh photos, so the default conditions suit us.

With the help of the Code Snippet block, we get the code we need. Since we use Python 3, we need to replace unirest library in the snippet with requests. So here’s our final snippet:

The response.json() method returns a Python dictionary with a list of the names and properties of the photos. Apparently, everything works well and we can start writing our application.

3. Make your first app with API

With the help of data with images of the Earth, we can create our own small application that will generate an animated image of the Earth based on the latest photos from NASA.

We will need the skimage library (installed as scikit-image) to resize images, as well as imageio to generate a single animated gif based on a selection of images. Both can be installed with pip: pip install scikit-image imageio.

We will also need the regex library to create separate variables with information about the year, month, day of the photo from the full date string. We will need these variables to form the url of an Earth photo in the format: https://epic.gsfc.nasa.gov/archive/natural////png/.png .
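A sketch of the date-splitting step only. It assumes the date arrives as a string that starts with YYYY-MM-DD (the sample value is made up) and uses the standard re module rather than a third-party package; `split_date` is an illustrative name. The resulting pieces would then fill the corresponding segments of the photo URL.

```python
import re

# Assumption: the date string starts with YYYY-MM-DD, e.g. "2019-05-30 01:10:51".
DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def split_date(date_string):
    """Return (year, month, day) as strings taken from the start of date_string."""
    match = DATE_PATTERN.match(date_string)
    if match is None:
        raise ValueError(f"unexpected date format: {date_string!r}")
    return match.groups()


year, month, day = split_date("2019-05-30 01:10:51")
print(year, month, day)
```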

As a result, we get the following application:

The result is an animated image of the Earth seen from orbit, a good starting point for further space exploration.

Conclusion

In this article, we started using the REST API in Python and consistently walked through all the necessary steps to create a Python application that uses almost limitless opportunities of APIs.

How to get JSON from webpage into Python script

What I want to do is get the {{...etc...}} stuff that I see on the URL when I load it in Firefox into my script so I can parse a value out of it. I’ve Googled a ton but I haven’t found a good answer as to how to actually get the {{...}} stuff from a URL ending in .json into an object in a Python script.

11 Answers

Get data from the URL and then call json.loads e.g.

Python3 example:

Python2 example:

The output would result in something like this:


I’ll take a guess that you actually want to get data from the URL:

Or, check out JSON decoder in the requests library.


This gets a dictionary in JSON format from a webpage with Python 2.X and Python 3.X:

I have found this to be the easiest and most efficient way to get JSON from a webpage when using Python 3:


You need to import requests and use its json() method:

Of course, this method also works:

json.loads will decode it into a Python object according to the json module's conversion table; for example, a JSON object will become a Python dict.


All that the call to urlopen() does (according to the docs) is return a file-like object. Once you have that, you need to call its read() method to actually pull the JSON data across the network.


In Python 2, json.load() will work instead of json.loads()

Unfortunately, that doesn’t work in Python 3 before version 3.6. json.load is just a wrapper around json.loads that calls read() for a file-like object. In those versions json.loads requires a string object, and the output of urllib.request.urlopen(url).read() is a bytes object. So one has to get the file encoding in order to make it work there. From Python 3.6 on, json.loads also accepts bytes.

In this example we query the headers for the encoding and fall back to utf-8 if we don’t get one. The headers object is different between Python 2 and 3 so it has to be done different ways. Using requests would avoid all this, but sometimes you need to stick to the standard library.
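The answer's own code is not preserved, so this is a Python 3 only sketch of the approach it describes. In Python 3 the response headers object offers get_content_charset(), which returns the declared encoding or the given fallback. `decode_json` and `fetch_json` are illustrative names, and the sample headers and body are made up so the decoding step runs offline.

```python
import json
import urllib.request
from email.message import Message


def decode_json(raw_bytes, headers):
    """Decode the body with the charset from the headers, falling back to utf-8."""
    charset = headers.get_content_charset("utf-8")
    return json.loads(raw_bytes.decode(charset))


def fetch_json(url):
    """Read the body and the headers from the response, then decode and parse."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return decode_json(response.read(), response.headers)


# Made-up headers and body; response.headers is a subclass of Message in Python 3.
headers = Message()
headers["Content-Type"] = "application/json; charset=iso-8859-1"
body = b'{"city": "Z\xfcrich"}'  # \xfc is the letter u with umlaut in ISO-8859-1

result = decode_json(body, headers)
print(result)
```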
