Make code simpler

Fixing Python JSON parsing errors when the data contains undefined

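Important: this article was last updated on 2023-04-13 16:16:52. Some articles are time-sensitive; if anything here is wrong or out of date, please leave a comment below or contact 代码狗.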

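Yesterday, while writing a watermark-free parser for Xiaohongshu videos and image sets (for Xiaohongshu watermark-free parsing, see the post on watermark-free parsing of Douyin short videos), I ran into a problem: the JSON data contained the token undefined, and Python raised an error when parsing it. It cost me several hours, so I am writing it down for future reference.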

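The JSON data I had fetched contained Unicode-escaped characters, and for a long time I assumed those were causing the error. Several online JSON formatters accepted the data without complaint, though, so I narrowed the problem down to the Python side.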
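A quick way to rule out the Unicode escapes is to parse a small fragment that contains them and nothing else. The snippet below is a minimal sketch; the one-key object is a made-up excerpt in the style of the test data, not the full payload.

```python
import json

# Hypothetical excerpt: "/" is written as the JSON escape \u002F, as in the test data.
raw = r'{"avatar": "https:\u002F\u002Fsns-avatar-qc.xhscdn.com\u002Favatar\u002F6321a2f60145e3f7513b25e9.jpg"}'

# json.loads decodes \uXXXX escapes inside strings, so this parses cleanly.
parsed = json.loads(raw)
print(parsed["avatar"])
```

If this fragment parses, the escapes are valid JSON and the failure has to come from something else in the payload.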

The test data is shown below:

{"user":{"loggedIn":false,"activated":false,"userInfo":{},"follow":[],"userPageData":{},"activeTabKey":0,"notes":[[],[],[]],"isFetchingNotes":[false,false,false],"tabScrollTop":[0,0,0],"userFetchingStatus":undefined,"bannedInfo":{"code":0,"showAlert":false,"reason":""}},"login":{"loginMethod":undefined,"from":"","showLogin":false,"agreed":false,"showTooltip":false,"loginData":{"phone":"","authCode":""},"errors":{"phone":"","authCode":""},"qrData":{"backend":{"qrId":"","code":""},"image":"","status":"un_scanned"}},"global":{"supportWebp":false,"serverTime":1681306563957},"layout":{"layoutInfoReady":false,"columns":6,"gap":{"vertical":16,"horizontal":32},"columnWidth":0,"showSideBar":true},"search":{"state":"auto","searchContext":{"keyword":"","page":1,"pageSize":20,"searchId":"","sort":"general","noteType":0},"feeds":[],"searchValue":""},"note":{"prevRouteData":{},"prevRoute":"Empty","note":{"type":"video","title":"绝绝子❗️❗️❗️华为是懂用户的❗️","interactInfo":{"shareCount":"10+","followed":false,"liked":false,"likedCount":"100+","collected":false,"collectedCount":"100+","commentCount":"9"},"tagList":[{"id":"641ae3e8000000000a00445c","name":"来聊聊你的副业","type":"topic"},{"id":"6016078a00000000010070e1","name":"笔记灵感","type":"topic"},{"id":"59db344d7642f1583e06a31f","name":"工作使我快乐","type":"topic"},{"id":"6145a43d0000000001007702","name":"数码达人成长计划","type":"topic"},{"type":"topic","id":"5be66abe878ba10001ed56db","name":"华为"},{"name":"华为手机","type":"topic","id":"5becef0420ebad0001445f62"},{"id":"54352e70d6e4a97deaf48432","name":"手机","type":"topic"}],"lastUpdateTime":1681223134000,"noteId":"64356ddd000000001303c0ae","desc":"[哇R]有一说一,华为这些隐藏功能,你不会还不知道吧❓简直不要太好用哇❗️[斜眼R]赶紧去看看华为手机的这些隐藏豪强功能,用起来究竟有多丝滑❗️你如果还不知道,真的就白买啦❗  
","user":{"userId":"63219e9400000000150190ee","nickname":"雷一鸣玩数码","avatar":"https:\u002F\u002Fsns-avatar-qc.xhscdn.com\u002Favatar\u002F6321a2f60145e3f7513b25e9.jpg"},"imageList":[{"traceId":"1000g0082avfjvpkh805g5op1jqa5b47e0bljc50","fileId":"a6dce28d-8242-371f-3eb7-37d93081d40e","height":1440,"width":1080,"url":"https:\u002F\u002Fsns-img-hw.xhscdn.com\u002Fa6dce28d-8242-371f-3eb7-37d93081d40e"}],"video":{"image":{"firstFrameFileid":"110\u002F0\u002F01e4356dbf79149c001000018770b54814_0.jpg","thumbnailFileid":"110\u002F0\u002F01e4356dbf79149c001000018770b572b4_0.webp"},"capa":{"duration":151},"consumer":{"originVideoKey":"pre_post\u002F1000g0cg2avf7jq8gm06g5op1jqa5b47e1gqvshg"},"media":{"videoId":136292634208048290,"video":{"bizName":110,"bizId":"280407822784512174","duration":152,"md5":"f7b65c1eac1a8c2869606588e8dbcba1","hdrType":0,"drmType":0,"streamTypes":[259]},"stream":{"h264":[{"duration":151255,"videoDuration":151233,"rotate":0,"audioBitrate":64058,"vmaf":-1,"psnr":0,"weight":62,"height":960,"backupUrls":["http:\u002F\u002Fsns-video-qc.xhscdn.com\u002Fstream\u002F110\u002F259\u002F01e4356dbf79149c010370038770b66cd0_259.mp4?sign=333ba02a310efb2c70abc5a67038805d&t=643825d4","http:\u002F\u002Fsns-video-hw.xhscdn.com\u002Fstream\u002F110\u002F259\u002F01e4356dbf79149c010370038770b66cd0_259.mp4","http:\u002F\u002Fsns-video-al.xhscdn.com\u002Fstream\u002F110\u002F259\u002F01e4356dbf79149c010370038770b66cd0_259.mp4"],"hdrType":0,"width":720,"audioCodec":"aac","qualityType":"HD","audioDuration":151254,"masterUrl":"http:\u002F\u002Fsns-video-bd.xhscdn.com\u002Fstream\u002F110\u002F259\u002F01e4356dbf79149c010370038770b66cd0_259.mp4","streamType":259,"format":"mp4","size":8255316,"volume":0,"avgBitrate":436630,"streamDesc":"WM_X264_MP4","videoCodec":"h264","ssim":0,"fps":30,"audioChannels":2,"defaultStream":0,"videoBitrate":366069}],"h265":[],"av1":[]}}},"atUserList":[],"time":1681223133000},"volume":0,"currentTime":1681306563985,"comments":{"list":[],"cursor":"","hasM
ore":true,"loading":false},"commentValue":"","commentAt":[],"commentSelectionStart":0,"commentSelectionEnd":0,"mediaWidth":450,"noteHeight":800,"showRedmoji":false,"commentTarget":{},"noteContentHeight":0,"metricsReportMetaData":{"currentReportType":"enter","startReadNoteClientTimeStamp":0,"noteStaySeconds":0,"isCommentCurrentNote":false,"isLikeCurrentNote":false,"isCollectCurrentNote":false,"isFollowCurrentNoteAuthor":false},"isImgFullscreen":false,"noteId":"64356ddd000000001303c0ae","gotoPage":"","commentNickName":"","firstNoteId":"64356ddd000000001303c0ae"},"feed":{"query":{"cursorScore":"","num":30,"refreshType":1,"noteIndex":0,"unreadBeginNoteId":"","unreadEndNoteId":"","unreadNoteCount":0,"category":"homefeed_recommend"},"feedBizFmp":{"fmp":0,"fmpWithImg":0},"isFetching":false,"isError":false,"feedsWrapper":undefined,"feeds":[],"currentChannel":"homefeed_recommend","unreadInfo":{"cachedFeeds":[],"unreadBeginNoteId":"","unreadEndNoteId":"","unreadNoteCount":0,"timestamp":0},"preloadSuccess":false,"preloadConfig":{"usePreload":false,"cacheExpires":259200000,"checkCache":false},"validIds":{"noteIds":[]},"mfStatistics":{"timestamp":0,"visitTimes":0,"readFeedCount":0},"channels":undefined},"redMoji":{"mojiData":{"version":"","redmojiTabs":[],"redmojiMap":{}}}}

Calling json.loads to convert this JSON string into an object fails with an error saying the string is not valid JSON.
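The failure can be reproduced with a much smaller input. The sketch below assumes a two-key excerpt of the test data and uses the pos attribute of json.JSONDecodeError to show where the decoder stopped; the variable names are only for illustration.

```python
import json

# Tiny excerpt of the test data containing the bare token undefined.
sample = '{"loginMethod":undefined,"from":""}'

error_message = None
failing_text = None
try:
    json.loads(sample)
except json.JSONDecodeError as exc:
    # exc.pos is the character offset at which the decoder gave up.
    error_message = f"{exc.msg} at position {exc.pos}"
    failing_text = sample[exc.pos:exc.pos + len("undefined")]

print(error_message)
print(failing_text)
```

Slicing the input at exc.pos points straight at the offending token, which is much faster than reading a multi-kilobyte payload by eye.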


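It eventually turned out that the undefined tokens were the cause. undefined is a JavaScript value that the JSON format does not allow, and Python's json module has no equivalent for it, so replacing each undefined with null is enough for the conversion to succeed.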

# Swap the JavaScript-only token undefined for null before parsing.
json_data = re.sub(r'\bundefined\b', 'null', json_str)

The replacement uses a regular expression, so the re module needs to be imported:

import json
import re
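One caveat with a plain text substitution is that it also rewrites the word undefined when it appears inside a string value, such as a note title or description. A possible refinement, sketched below, matches whole string literals first and leaves them untouched. The names TOKEN and replace_undefined are hypothetical helpers introduced here, not part of the original script.

```python
import json
import re

# Match either a complete JSON string literal (group 1) or a bare undefined token.
TOKEN = re.compile(r'("(?:\\.|[^"\\])*")|\bundefined\b')

def replace_undefined(text: str) -> str:
    """Turn bare undefined tokens into null, leaving string literals as they are."""
    return TOKEN.sub(
        lambda m: m.group(1) if m.group(1) is not None else "null",
        text,
    )

# Hypothetical sample: one string value happens to be the word "undefined".
sample = '{"loginMethod":undefined,"status":"undefined","channels":undefined}'
data = json.loads(replace_undefined(sample))
print(data)
```

Because the regex consumes each string literal as a single match, the undefined alternative can only fire outside of strings. For data that uses other JavaScript-only syntax as well, a text substitution like this may not be enough.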
