I'm trying to use WebHDFS with Azure Data Lake. According to Microsoft's documentation, the steps one should follow are:
Using the client_id, tenant_id, and secret key, make a request to the OAuth2 token endpoint:
curl -X POST https://login.microsoftonline.com/<TENANT-ID>/oauth2/token \
-F grant_type=client_credentials \
-F resource=https://management.core.windows.net/ \
-F client_id=<CLIENT-ID> \
-F client_secret=<AUTH-KEY>
Upon success, you get back some JSON including an "access_token" field, whose value you should include in subsequent WebHDFS requests by adding the header
Authorization: Bearer <content of "access_token">
where <content of "access_token"> is the long string in the "access_token" field.
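For anyone scripting this instead of using curl, here is a minimal Python sketch of the same token request, using only the standard library. The function names are hypothetical, and one assumption differs from the curl command above: the fields are sent URL-encoded rather than as the multipart form that curl -F produces.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(tenant_id, client_id, client_secret):
    """Build (but do not send) the client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    form = {
        "grant_type": "client_credentials",
        "resource": "https://management.core.windows.net/",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    # Assumption: URL-encoded body instead of curl's multipart (-F) form.
    body = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")


def fetch_access_token(tenant_id, client_id, client_secret):
    """Send the request and return the "access_token" string from the JSON reply."""
    request = build_token_request(tenant_id, client_id, client_secret)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]
```

The string returned by fetch_access_token is what goes after "Bearer " in the Authorization header.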
Once you have included that header, you should be able to make WebHDFS calls. For example, to list directories, you could run:
curl -i -X GET -H "Authorization: Bearer <REDACTED>" https://<yourstorename>.azuredatalakestore.net/webhdfs/v1/?op=LISTSTATUS
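The same LISTSTATUS call could be sketched in Python roughly as follows. The function names are hypothetical, and the parsing assumes the standard WebHDFS response shape, with a "FileStatuses" object wrapping a "FileStatus" list.

```python
import json
import urllib.request


def build_liststatus_request(store_name, access_token, path="/"):
    """Build (but do not send) a WebHDFS LISTSTATUS request for the given path."""
    url = (
        f"https://{store_name}.azuredatalakestore.net"
        f"/webhdfs/v1{path}?op=LISTSTATUS"
    )
    headers = {"Authorization": f"Bearer {access_token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def list_directory(store_name, access_token, path="/"):
    """Send the request and return the entry names found under path."""
    request = build_liststatus_request(store_name, access_token, path)
    # urlopen raises urllib.error.HTTPError on a 401 or any other error status.
    with urllib.request.urlopen(request) as response:
        listing = json.load(response)
    # Assumption: standard WebHDFS layout for the LISTSTATUS reply.
    return [entry["pathSuffix"] for entry in listing["FileStatuses"]["FileStatus"]]
```

On a 401, the caught HTTPError exposes the response headers and body, which is where the WWW-Authenticate details would show up.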
Having followed all those steps, I am getting an HTTP 401 error when running the above curl command to list directories:
WWW-Authenticate: Bearer authorization_uri="https://login.windows.net/<REDACTED>/", error="invalid_token", error_description="The access token is invalid."
with the body
{"error":{"code":"AuthenticationFailed","message":"Failed to validate the access token in the 'Authorization' header."}}
Does anyone know what might be the problem?
I pasted the token into jwt.io and it decodes correctly (I didn't check the signature). The content is something like this:
{
typ: "JWT",
alg: "RS256",
x5t: "MnC_VZcATfM5pOYiJHMba9goEKY",
kid: "MnC_VZcATfM5pOYiJHMba9goEKY"
}.
{
aud: "https://management.core.windows.net",
iss: "https://sts.windows.net/<TENANT-ID>/",
iat: 1460908119,
nbf: 1460908119,
exp: 1460912019,
appid: "<APP-ID>",
appidacr: "1",
idp: "https://sts.windows.net/<TENANT-ID>/",
oid: "34xxxxxx-xxxx-xxxx-xxxx-5460xxxxxxd7",
sub: "34xxxxxx-xxxx-xxxx-xxxx-5460xxxxxxd7",
tid: "<TENANT-ID>",
ver: "1.0"
}.
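The same inspection can be done locally instead of pasting the token into a website. This is only a sketch with hypothetical helper names: it decodes the payload segment and, like the check above, does not verify the signature.

```python
import base64
import json
import time


def decode_jwt_claims(token):
    """Return the payload claims of a JWT without verifying its signature."""
    payload_b64 = token.split(".")[1]
    # base64url encoding drops the trailing "=" padding; restore it first.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def seconds_until_expiry(claims, now=None):
    """Return how many seconds remain before the "exp" claim is reached."""
    now = time.time() if now is None else now
    return claims["exp"] - now
```

With the claims shown above, exp minus iat is 3900 seconds, so the token is good for 65 minutes after it is issued. A token that decodes cleanly only says who the caller is and which audience it was issued for; it says nothing about what that caller is allowed to do inside the store.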
Please click the Data Explorer button, then highlight the root folder and click Access. Grant your AAD app permissions there so that it can use WebHDFS. I believe what you have done so far only grants that AAD app permission to manage your Azure Data Lake Store through the portal or Azure PowerShell; you haven't actually granted WebHDFS permissions yet. Further reading on security is here.