oracle - Convert many date formats into a single formatted date

Updated: 2023-10-04 23:39:34

I want to convert a string which contains a date into a single date format. For example:

  • 13-06-2012 to 13-JUN-12
  • 13/06/2012 to 13-JUN-12
  • 13-JUN-2012 to 13-JUN-12
  • 13/jun-2012 to 13-JUN-12
  • ...

I tried deleting all the special characters and then using a function to transform the string into a single date format. My function raises a lot of exceptions and I don't know why...

The function:

CREATE OR REPLACE FUNCTION normalize_date (data_in IN VARCHAR2)
    RETURN DATE
IS
    tmp_month         VARCHAR2 (3);
    tmp_day           VARCHAR2 (2);
    tmp_year          VARCHAR2 (4);
    TMP_YEAR_NUMBER   NUMBER;
    result            DATE;
BEGIN
    tmp_day := SUBSTR (data_in, 1, 2);
    tmp_year := SUBSTR (data_in, -4);

    --if(REGEXP_LIKE(SUBSTR(data_in,3,2), '[:alpha:]')) then 
    IF (SUBSTR (data_in, 3, 1) IN ('a','j','i','f','m','s','o','n','d','A','J','I','F','M','S','O','N','D'))
    THEN
        tmp_month := UPPER (SUBSTR (data_in, 3, 3));
    ELSE
        tmp_month := SUBSTR (data_in, 3, 2);
    END IF;

    DBMS_OUTPUT.put_line (tmp_year);

    TMP_YEAR_NUMBER := TO_NUMBER (tmp_year);

    IF (tmp_month = 'JAN')
    THEN
        tmp_month := '01';
    END IF;

    IF (tmp_month = 'FEB')
    THEN
        tmp_month := '02';
    END IF;

    IF (tmp_month = 'MAR')
    THEN
        tmp_month := '03';
    END IF;

    IF (tmp_month = 'APR')
    THEN
        tmp_month := '04';
    END IF;

    IF (tmp_month = 'MAY')
    THEN
        tmp_month := '05';
    END IF;

    IF (tmp_month = 'JUN')
    THEN
        tmp_month := '06';
    END IF;

    IF (tmp_month = 'JUL')
    THEN
        tmp_month := '07';
    END IF;

    IF (tmp_month = 'AUG')
    THEN
        tmp_month := '08';
    END IF;

    IF (tmp_month = 'SEP')
    THEN
        tmp_month := '09';
    END IF;

    IF (tmp_month = 'OCT')
    THEN
        tmp_month := '10';
    END IF;

    IF (tmp_month = 'NOV')
    THEN
        tmp_month := '11';
    END IF;

    IF (tmp_month = 'DEC')
    THEN
        tmp_month := '12';
    END IF;

   -- dbms_output.put_line(tmp_day || '~'||tmp_year || '~' ||tmp_month);

    IF (LENGTH (tmp_day || tmp_year || tmp_month) <> 8)
    THEN
        result := TO_DATE ('31122999', 'DDMMYYYY');
        RETURN result;
    END IF;

 --   dbms_output.put_line('before end');
    result:=TO_DATE (tmp_day || tmp_month ||tmp_year , 'DDMMYYYY');
 --   dbms_output.put_line('date result: '|| result);
    RETURN result;
EXCEPTION
    WHEN NO_DATA_FOUND
    THEN
        NULL;
    WHEN OTHERS
    THEN
        result := TO_DATE ('3012299', 'DDMMYYYY');
        RETURN result;
        RAISE;
END normalize_date;

Usage

SELECT customer_no,
       str_data_expirare,
       normalize_date (str_data_expirare_trim) AS data_expirare_buletin
  FROM (SELECT customer_no,
               str_data_expirare,
               REGEXP_REPLACE (str_data_expirare, '[^a-zA-Z0-9]+', '')
                   AS str_data_expirare_trim
          FROM (SELECT Q1.set_act_id_1,
                       Q1.customer_no,
                       NVL (SUBSTR (set_act_id_1,
                                      INSTR (set_act_id_1,
                                             '+',
                                             1,
                                             5)
                                    + 1,
                                    LENGTH (set_act_id_1)),
                            'NULL')
                           AS str_data_expirare
                  FROM STAGE_CORE.IFLEX_CUSTOMERS Q1
                  WHERE Q1.set_act_id_1 IS NOT NULL
                  )
        );

If you have a sound idea of all the possible date formats it might be easier to use brute force:

create or replace function clean_date
    ( p_date_str in varchar2)
    return date
is
    l_dt_fmt_nt sys.dbms_debug_vc2coll := sys.dbms_debug_vc2coll
        ('DD-MON-YYYY', 'DD-MON-YY', 'DD-MM-YYYY', 'MM-DD-YYYY', 'YYYY-MM-DD'
         , 'DD/MM/YYYY', 'MM/DD/YYYY', 'YYYY/MM/DD', 'DD/MM/YY', 'MM/DD/YY');
    return_value date;
begin
    for idx in l_dt_fmt_nt.first()..l_dt_fmt_nt.last()
    loop
        begin
            return_value := to_date(p_date_str, l_dt_fmt_nt(idx));
            exit;  -- the first mask that converts wins
        exception
             when others then null;  -- conversion failed: fall through to the next mask
        end;
    end loop;
    if return_value is null then
        raise no_data_found; 
    end if;
    return return_value;
exception
    when no_data_found then
        raise_application_error(-20000, p_date_str|| ' is unknown date format');
end clean_date;
/
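
As a quick check, the sample strings from the question can be run through the function. This is a minimal sketch that assumes clean_date compiled as shown above and that the session's date language is English, so that 'JUN' is recognised; the output mask is given explicitly so the display does not depend on NLS_DATE_FORMAT.

```sql
-- Sample strings taken from the question.
-- The output mask and language are set explicitly for a stable display.
select str,
       to_char(clean_date(str), 'DD-MON-YY', 'NLS_DATE_LANGUAGE=ENGLISH') as normalized
from (
    select '13-06-2012' as str from dual union all
    select '13/06/2012' from dual union all
    select '13-JUN-2012' from dual union all
    select '13/jun-2012' from dual
);
```

Under those assumptions each row should come back as 13-JUN-12, because to_date tolerates differences in punctuation and letter case between the string and the mask.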

Be aware that modern versions of Oracle are quite forgiving with date conversion. This function handled dates in formats which aren't in the list, with some interesting consequences:

SQL> select  clean_date('20160817') from dual;

CLEAN_DAT
---------
17-AUG-16

SQL> select  clean_date('160817') from dual;

CLEAN_DAT
---------
16-AUG-17

SQL> 

Which demonstrates the limits of automated data cleansing in the face of lax data integrity rules. The wages of sin is corrupted data.
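
If that leniency is unwanted, one option (an addition here, not part of the original answer) is the FX format modifier, which makes the conversion match the mask exactly, punctuation included. A minimal sketch using VALIDATE_CONVERSION, which assumes Oracle 12.2 or later:

```sql
-- VALIDATE_CONVERSION returns 1 when the conversion would succeed and 0 when it would fail.
select validate_conversion('20160817'   as date, 'YYYY-MM-DD')   as lenient,    -- 1: missing hyphens are tolerated
       validate_conversion('20160817'   as date, 'fxYYYY-MM-DD') as exact_bad,  -- 0: FX requires the hyphens
       validate_conversion('2016-08-17' as date, 'fxYYYY-MM-DD') as exact_ok    -- 1: matches the mask exactly
from dual;
```

The trade-off is that with FX every punctuation variant needs its own mask, so the list of formats gets longer.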


@AlexPoole raises the matter of using the 'RR' format. This element of the date mask was introduced as a Y2K kludge. It's rather depressing that we're still discussing it almost two decades into the new Millennium.

Anyway, the issue is this. If we cast this string '161225' to a date what century does it have? Well, 'yymmdd' will give 2016-12-25. Fair enough, but what about '991225'? How likely is it that the date we really want is 2099-12-25? This is where the 'RR' format comes into play. Basically it defaults the century: numbers 00-49 default to 20, 50-99 default to 19. This window was determined by the Y2K issue: in 2000 it was more likely that '98 referred to the recent past than the near future, and similar logic applied to '02. Hence the halfway point of 1950. Note this is a fixed point not a sliding window. The further we move from the year 2000, the less useful that pivot point becomes.
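
To make the difference concrete, here is a small query (a sketch, not from the original answer) that casts the same two strings with both masks. The results in the comments assume it is run in a year between 2000 and 2049, since both YY and RR depend on the current date.

```sql
-- YY always takes the current century; RR picks the century from the 00-49 / 50-99 window.
select to_char(to_date('161225', 'yymmdd'), 'YYYY-MM-DD') as yy_16,  -- 2016-12-25
       to_char(to_date('991225', 'yymmdd'), 'YYYY-MM-DD') as yy_99,  -- 2099-12-25
       to_char(to_date('161225', 'rrmmdd'), 'YYYY-MM-DD') as rr_16,  -- 2016-12-25
       to_char(to_date('991225', 'rrmmdd'), 'YYYY-MM-DD') as rr_99   -- 1999-12-25
from dual;
```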

Anyway, the key point is that 'RRRR' does not play nicely with other date formats: to_date('501212', 'rrrrmmdd') hurls ORA-01843: not a valid month. So, use 'RR' and test for it before using 'YYYY'. So my revised function (with some tidying up) looks like this:

create or replace function clean_date
    ( p_date_str in varchar2)
    return date
is
    -- Two-digit-year (RR) masks come first so they are tried before the YYYY masks.
    l_dt_fmt_nt sys.dbms_debug_vc2coll := sys.dbms_debug_vc2coll
        ('DD-MM-RR', 'MM-DD-RR', 'RR-MM-DD', 'RR-DD-MM'
         , 'DD-MM-YYYY', 'MM-DD-YYYY', 'YYYY-MM-DD', 'YYYY-DD-MM');
    return_value date;
begin
    for idx in l_dt_fmt_nt.first()..l_dt_fmt_nt.last()
    loop
        begin
            return_value := to_date(p_date_str, l_dt_fmt_nt(idx));
            exit;
        exception
             when others then null;
        end;
    end loop;
    if return_value is null then
        raise no_data_found; 
    end if;
    return return_value;
exception
    when no_data_found then
        raise_application_error(-20000, p_date_str|| ' is unknown date format');
end clean_date;
/

The key point remains: there's a limit to how smart we can make this function when it comes to interpreting dates, so make sure you lead with the best fit. If you think most of your date strings fit day-month-year put that first; you will still get some wrong casts, but fewer than if you lead with year-month-day.
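
How much the ordering matters shows up with a string that is valid under several masks. The query below is an illustrative sketch: the sample string '01-02-03' is made up for the example, and the results in the comments assume a current year between 2000 and 2049.

```sql
-- One ambiguous string, three masks, three different dates.
-- Whichever mask appears first in the collection decides the result.
select to_char(to_date('01-02-03', 'DD-MM-RR'), 'YYYY-MM-DD') as dmy,  -- 2003-02-01
       to_char(to_date('01-02-03', 'MM-DD-RR'), 'YYYY-MM-DD') as mdy,  -- 2003-01-02
       to_char(to_date('01-02-03', 'RR-MM-DD'), 'YYYY-MM-DD') as ymd   -- 2001-02-03
from dual;
```

None of these conversions raises an error, so the function cannot tell which one is right; only knowledge of the source data can.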